Binaural synthesis, head-related transfer functions, and uses thereof

ABSTRACT

A method and apparatus for simulating the transmission of sound from sound sources to the ear canals of a listener encompasses novel head-related transfer functions (HTFs), novel methods of measuring and processing HTFs, and novel methods of changing or maintaining the directions of the sound sources as perceived by the listener. The measurement methods enable the measurement and construction of HTFs for which the time domain descriptions are surprisingly short, and for which the differences between listeners are surprisingly small. The novel HTFs can be exploited in any application concerning the simulation of sound transmission, measurement, simulation, or reproduction. The invention is particularly advantageous in the field of binaural synthesis, specifically, the creation, by means of two sound sources, of the perception in the listener of listening to sound generated by a multichannel sound system. It is also particularly useful in the designing of electronic filters used, for example, in virtual reality systems, and in the designing of an "artificial head" having HTFs that approximate the HTFs of the invention as closely as possible in order to make the best possible representation of humans by the artificial head, thereby making artificial head recordings of optimal quality.

FIELD OF THE INVENTION

The present invention relates to improved methods and apparatus for simulating the transmission of sound from sound sources to the ear canals of a listener, said sound sources being positioned arbitrarily in three dimensions in relation to the listener. In particular, the invention relates to novel uses of certain Head-related Transfer Functions and the production of such Head-related Transfer Functions, as well as to methods and apparatus using the Head-related Transfer Functions.

BACKGROUND OF THE INVENTION

Human beings detect and localize sound sources in three-dimensional space by means of the human binaural sound localization capability.

The input to the hearing consists of two signals: sound pressures at each of the eardrums. These two sound signals are called binaural sound signals. The term binaural refers to the fact that a set of two signals form the input to the hearing. It is not fully known how the hearing extracts information about distance and direction to a sound source, but it is known that the hearing uses a number of cues in this determination. Among the cues are coloration, interaural time differences, interaural phase differences and interaural level differences. Thorough descriptions of cues to directional hearing are given by J. Blauert: "Räumliches Hören", Hirzel Verlag, Stuttgart, Germany, 1974, and "Spatial Hearing", The MIT Press, Cambridge, Mass., 1983.

This means that if the sound pressures at the eardrums are created exactly as they would have been created by a given spatial sound field, a listener would not be able to distinguish this sound experience from the one he would get from being exposed to the spatial sound field itself.

One known way of approaching this ideal sound reproducing situation is by the artificial head recording technique. An artificial head is a model of a human head where the geometries of a human being which are acoustically relevant especially with respect to diffraction around the body, shoulder, head and ears are modelled as closely as possible. During a recording, e.g. of a concert, two microphones are positioned in the ear canals of the artificial head to sense sound pressures, and the electrical output signals from these microphones are recorded.

When these signals are reproduced, e.g. by headphones, the sound pressures in the ear canals of the artificial head during the concert are reproduced in the ear canals of the listener and the listener will achieve the perception that he was listening to the concert in the concert hall. The signals for the headphones are also called binaural signals.

The term binaural signals designates a set of two signals, left and right, having been coded using transmission characteristics corresponding to the transmission to the two ears of the human listener, for instance to be presented in the left and right ear canals, respectively, of a listener.

The binaural signals may typically be electrical signals, but they may also be, e.g. optical signals, electromagnetic signals or any other type of signal which can be transformed, directly or indirectly, into sound signals in the left and right ears of a human.

The transmission of a sound wave propagating from a sound source positioned at a given direction and distance in relation to the left and right ears of the listener is described in terms of two transfer functions, one for the left ear and one for the right ear, that include any linear distortion, such as coloration, interaural time differences and interaural spectral differences. These transfer functions change with direction and distance of the sound source in relation to the ears of the listener. It is possible to measure the transfer functions for any direction and distance and simulate the transfer functions, e.g. electronically by means of filters. If such filters are inserted in the signal path between a playback unit such as a tape recorder and headphones used by a listener, the listener will achieve the perception that the sounds generated by the headphones originate from a sound source positioned at the distance and in the direction as defined by the transfer functions of the filters, because of the true reproduction of the sound pressures in the ears.

A set of two such transfer functions, one for the left ear and one for the right ear, is called a Head-related Transfer Function (HTF). Each transfer function is defined as the ratio between a sound pressure p generated by a plane wave at a specific point in or close to the appertaining ear canal (p_L in the left ear canal and p_R in the right ear canal) in relation to a reference. The reference traditionally chosen is the sound pressure p_1 generated by a plane wave at a position right in the middle of the head, but with the listener absent. In the frequency domain this HTF is given by:

    H_L = \frac{P_L}{P_1}, \qquad H_R = \frac{P_R}{P_1} \qquad (1)

where L designates the left ear and R designates the right ear. The time domain representation or description of the HTF, that is the inverse Fourier transform of the HTF, is often called the Head-related Impulse Response (HIR). Thus, the time domain description of the HTF is a set of two impulse responses, one for the left ear and one for the right ear, each of which is the inverse Fourier transform of the corresponding transfer function of the set of two transfer functions of the HTF in the frequency domain.
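The relation between equation (1) and the HIR can be sketched numerically. The following is a minimal illustration only, assuming sampled pressure signals and a plain discrete Fourier transform; the function names and the toy values of `p_left` and `p_ref` are hypothetical and are not measured data.

```python
import cmath
import math


def dft(samples):
    """Discrete Fourier transform of a finite sequence."""
    n_total = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * n / n_total)
            for n, x in enumerate(samples))
        for k in range(n_total)
    ]


def inverse_dft(spectrum):
    """Inverse DFT; the real part is kept because an HIR is real-valued."""
    n_total = len(spectrum)
    return [
        sum(v * cmath.exp(2j * math.pi * k * n / n_total)
            for k, v in enumerate(spectrum)).real / n_total
        for n in range(n_total)
    ]


# Toy data (assumption): pressure at the left ear and reference pressure.
# The reference is an ideal impulse, so its spectrum is 1 at every bin.
p_left = [0.0, 0.8, 0.3, 0.0]
p_ref = [1.0, 0.0, 0.0, 0.0]

# Equation (1) for the left ear, evaluated bin by bin.
htf_left = [pl / p1 for pl, p1 in zip(dft(p_left), dft(p_ref))]

# The HIR is the inverse Fourier transform of the HTF.
hir_left = inverse_dft(htf_left)
```

The right ear would be handled in the same way with its own pressure signal. With a real reference signal the division requires the reference spectrum to be non-zero at every frequency of interest.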

The HTF depends upon the angle of incidence of the plane wave in relation to the listener. It gives a complete description of the sound transmission to the ears of the listener, including diffraction around the head, reflections from shoulders, reflections in the ear canal, etc.

The definitions given in equation (1) were given by J. Blauert: "Räumliches Hören", Hirzel Verlag, Stuttgart, Germany, 1974.

A tutorial about binaural techniques is given by Henrik Møller: "Fundamentals of Binaural Technology", Applied Acoustics No. 3/4, pp. 171-218, vol. 36, 1992.

As mentioned above, binaural signals may be generated using the artificial head recording and reproducing technique; the artificial head could be substituted with a test person.

Alternatively, binaural signals may be generated by any means that simulate the transmission of sound to the ear canals of humans, such as analog filters, digital filters, signal processors, computers, etc.

U.S. Pat. No. 3,920,904 discloses a method for creating sound pressures at the eardrums of a listener by means of headphones, that correspond to sound pressures which would be created at the eardrums of the listener in a predetermined acoustical environment in response to electrical signals applied to a number of loudspeakers, comprising measurement of the HTFs corresponding to the positioning of the loudspeakers in relation to the listener and simulation of the HTFs with analog electronic filters.

It has also been claimed to be possible to design the simulating filters using a different approach that does not include a measurement of HTFs but relies on knowledge of specific cues to directional hearing. Such an approach is disclosed in U.S. Pat. No. 4,817,149, where a front/back cue is generated by a spectral bias, elevation by a notch filter, and azimuth by a time-shift between the two channels.

BRIEF DISCLOSURE OF THE INVENTION

The present invention is based on intensive research in the field of binaural techniques and provides high quality HTFs as well as a number of other improvements of the binaural techniques and other techniques in which HTFs are used.

Thus, the invention provides, inter alia, new and improved methods for measurement of HTFs, new and improved HTFs, new and improved methods for processing HTFs, new methods of changing, or of maintaining, the directions of the sound sources as perceived by a listener, and as one of the most important utilizations thereof, new methods for binaural synthesis.

One object of the present invention is to provide HTFs for which the differences between the gains, in the frequency domain, of an HTF from one human to another are very low, or the differences between the corresponding time domain descriptions of the HTFs are very low. The inventors have carried out a major study of a number of HTFs for a number of different individuals, for a number of different directions, and for a number of different measurement points in the external ear of the individual, i.e. inside the ear canal or in the vicinity of the entrance to the ear canal. During this study the inventors have improved the measurement method so that it is now possible to measure and/or construct HTFs for which the time domain descriptions are surprisingly short and for which the differences from one individual to the other are surprisingly low.

According to the present invention, a group of HTFs with advantageous features has been provided that can be exploited in any application concerning measurement or reproduction of sound, such as in the design of electronic filters used in the simulation of sound transmission from a sound source to the ear canals of the listener or in the design of an artificial head that is designed so that its HTFs approximate the HTFs of the invention as closely as possible in order to make the best possible representation of humans by the artificial head, e.g. to make artificial head recordings of optimum quality.

Further, the present invention provides methods of extracting or constructing, for each direction of a sound source in relation to the listener, a function that represents the human HTFs of a group of humans, which function can be used as the design target in different applications, such as the design of an artificial head or the design of signal processing means.

Still further, the present invention provides a new method of interpolation whereby a virtual distance and direction of a virtual sound source can be created based upon transfer functions corresponding to different directions.

DETAILED DISCLOSURE OF THE INVENTION

One main aspect of the invention relates to a method of generating binaural signals by filtering at least one sound input with at least one set of two filters, each set of two filters having been designed so that the two filters simulate the left ear and the right ear parts of a Head-related Transfer Function (HTF), the method showing at least one of the features a)-c):

a) the HTF is used generally for a population of humans for which the binaural signals are intended, the HTF being determined in such a manner that the standard deviation of the amplitude, in dB, between subjects, over at least a major part of the frequency interval between 1 kHz and 8 kHz is at the most as shown in FIG. 22 for at least one of the curves thereof,

b) the duration of the time domain representation of the transfer function of the filters simulating the HTF is at the most 2 ms,

c) the value at zero Hertz of the frequency domain description of the transfer function of the filters simulating the HTF is in the range from 0.316 to 3.16.

With respect to feature a):

An important aspect of the invention relates to the utilization of "general" HTFs in binaural synthesis. The term "general" refers to the very desirable fact that it is now possible to generate binaural signals using "general" HTFs that typically differ from the HTFs of a listener and still provide to the listener a high quality auditive experience with a high quality of sound reproduction and a distinct localization of the virtual sound sources. A "general" HTF or a set of "general" HTFs can be defined as an HTF for an individual subject of a population or a set of HTFs for individual subjects of a population, for a particular angle of sound incidence, the HTF or HTFs being determined in such a manner that the standard deviation of the amplitude, in dB, between subjects, over at least a major part of the frequency interval between 1 kHz and 8 kHz is at most as shown in FIGS. 22-24 for at least one of the curves of the figure in question. In the present context, the term "over a major part of the frequency interval" indicates that in the logarithmic representation of FIGS. 22-24, the standard deviation will be at the most a value identical to the value of the curve at the frequency in question over a major part of the frequency interval, seen in the same logarithmic representation. In other words, the condition is complied with when, over at least 51% of the millimeters of X axis representing the frequency range between 1 kHz and 8 kHz, the standard deviation is less than or at the most identical to the value represented by the curve in question. This definition does not indicate that the standard deviation will be higher than the curve value in the range of 100 Hz to 1 kHz, which is also shown in the figures; there it will always or almost always be lower than the curve value or at the most identical with the curve value. Rather, the definition focuses on the part of the curve, between 1 kHz and 8 kHz, which is much more critical with respect to "generality". It is, of course, preferred that the condition is complied with over a higher proportion of the frequency range, such as at least 75% or at least 90%, and most preferred that it is complied with at all frequencies such as is the case in the results reported herein, but even the least stringent condition defined above will represent a high degree of generality.
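The "51% of the X axis" condition can be checked numerically once the standard deviation and the limiting curve are tabulated at the same frequencies. The sketch below is one possible reading, not a definitive procedure: it assumes that each interval between adjacent frequency points is weighted by its width on a logarithmic axis and counts as compliant only if both of its end points comply. The function name and all numerical values are hypothetical; the curves of FIGS. 22-24 are not reproduced here.

```python
import math


def compliant_fraction(freqs_hz, std_db, limit_db, f_lo=1000.0, f_hi=8000.0):
    """Fraction of the logarithmic frequency axis between f_lo and f_hi
    over which the standard deviation does not exceed the limiting curve."""
    total = 0.0
    compliant = 0.0
    for i in range(len(freqs_hz) - 1):
        a, b = freqs_hz[i], freqs_hz[i + 1]
        if a < f_lo or b > f_hi:
            continue  # interval lies outside the range of interest
        width = math.log(b / a)  # length of the interval on a log axis
        total += width
        if std_db[i] <= limit_db[i] and std_db[i + 1] <= limit_db[i + 1]:
            compliant += width
    return compliant / total if total > 0.0 else 0.0


# Toy values (assumption), tabulated at octave spacing.
freqs_hz = [1000.0, 2000.0, 4000.0, 8000.0]
limit_db = [2.0, 2.0, 2.0, 2.0]
std_db = [1.0, 1.5, 1.8, 3.0]

fraction = compliant_fraction(freqs_hz, std_db, limit_db)
is_general = fraction >= 0.51
```

The thresholds 0.75 and 0.90 mentioned above for the preferred embodiments could be substituted for 0.51 in the final comparison.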

As appears from FIGS. 22-24 and the appertaining discussion, extremely low variations can be obtained and have been obtained between subjects, in particular for the most important angles of sound incidence. This means that "general" high quality HTFs can now be used for all the various purposes for which HTFs are used, thus very significantly increasing the practical commercial usefulness of HTFs and techniques related thereto, such as binaural techniques, in particular binaural synthesis.

As the anatomy of humans shows a substantial variability from one individual to the other and as the HTFs of a human among other things are determined by diffractions and reflections around the head and pinna and the transmission characteristics through the ear canals, it is intuitively understood that the HTFs are different for different individuals. In the prior art, these differences are considered to be large. Experiments have been performed where binaural signals have been generated using HTFs from another person than the listener, whereby the listener's auditive experience has been disappointing, among other things due to a diminished ability of localizing the virtual sound sources from the binaural signal. Thus, in the art, the variability of HTFs among humans is considered to be a major impediment for the use of one set of HTFs for different listeners. For example, it is reported that: "Substantial intersubject variability in the HRTF for a single source position is to be expected, given differences in head size and pinna shape. This HRTF variability has been reported before (Shaw 1966) and is prominent in our data. (. . .) FIG. 3 shows that variability in HRTF from subject to subject grows with frequency until it reaches a peak of almost 8 dB between 7 and 10 kHz", F. L. Wightman and D. Kistler, "Headphone Simulation of Free-Field Listening, I: Stimulus Synthesis, II: Psychoacoustical Validation," J. Acoust. Soc. Am. Vol. 85(2), pp. 858-878, 1989. The data reported are 1/3 octave noise band values.

However, it is a major achievement of the present invention that it has now been found that it is possible to provide or determine an HTF (A) for a particular angle of sound incidence which is so close to corresponding individual HTFs that the function HTF (A) will satisfy even critical quality demands by almost all potential users for whom the function is intended, in contrast to the widespread belief in the art that the HTF would have to be adapted to the individual user to achieve a satisfactory quality in the practical uses of the HTF. In practice, this will mean that the use according to the invention of the HTF (A) will result in a higher quality in almost all situations of use, and thus a general improvement. This is illustrated in more detail later in the description with reference to FIG. 8.

The ability of the HTF (A) to be close to corresponding individual HTFs, or, expressed in another manner, to be a member of a group of HTFs determined with a low standard deviation, is quantitatively described by the conditions mentioned above with respect to FIGS. 22-24. The HTFs are considered to have the quality of generality when the standard deviation is at the most as shown in FIG. 22 for at least one of the appropriate curves of FIG. 22.

The properties of the HTF complying with the criteria of FIG. 22 for a population, such as, e.g., U.S. astronauts or Scandinavian teenagers, or, quite generally, a population for which the product of the binaural synthesis is intended or primarily intended, can, thus, also be expressed by the square root of the mean of the squared differences between

the amplitude, given in dB for third octave noise, of the HTF

and

the amplitudes, given in dB for third octave noise for a group of randomly selected individual HTFs of the population, being at the most 2.2 times the standard deviation as shown in FIG. 8 for the majority of the third octave frequencies shown, preferably at the most 1.7 times the standard deviation as shown in FIG. 8, more preferably at the most 1.4 times the standard deviation as shown in FIG. 8, and most preferably at the most 1.2 or even 1.1 times the standard deviation as shown in FIG. 8.
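As a sketch of how this root-mean-square criterion might be evaluated, the following computes, for each third octave band, the square root of the mean of the squared differences between a candidate HTF and a group of individual HTFs, and then tests whether the result stays within a given multiple of a reference standard deviation for the majority of the bands. The function names and all values are hypothetical; in particular, `reference_std_db` merely stands in for the standard deviation of FIG. 8, which is not reproduced here, and "majority" is read as more than half of the bands.

```python
import math


def rms_difference_db(candidate_db, individuals_db):
    """Per-band RMS difference (dB) between a candidate HTF amplitude
    and the amplitudes of a group of individual HTFs."""
    n_individuals = len(individuals_db)
    return [
        math.sqrt(
            sum((individual[k] - candidate_db[k]) ** 2
                for individual in individuals_db) / n_individuals
        )
        for k in range(len(candidate_db))
    ]


def meets_criterion(candidate_db, individuals_db, reference_std_db, factor=2.2):
    """True if the RMS difference is at most factor times the reference
    standard deviation for more than half of the bands."""
    rms = rms_difference_db(candidate_db, individuals_db)
    passing = sum(1 for r, s in zip(rms, reference_std_db) if r <= factor * s)
    return passing > len(rms) / 2


# Toy data (assumption): two bands, two individuals.
candidate_db = [0.0, 3.0]
individuals_db = [[1.0, 3.0], [-1.0, 3.0]]
reference_std_db = [1.0, 1.0]

rms = rms_difference_db(candidate_db, individuals_db)
ok = meets_criterion(candidate_db, individuals_db, reference_std_db)
```

The stricter factors 1.7, 1.4, 1.2 and 1.1 named above can be passed as `factor` to test the preferred levels.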

In the assessment of whether an HTF fulfils these "generality" qualities, the individual HTFs (of a representative number of individuals of the population) to be compared with the HTF in question could be determined for a particular angle of sound incidence, a particular distance, a particular reference point for the HTFs, and a particular posture, the determination being performed so that the repeatability of the measurement, expressed in terms of standard deviation of the amplitude, in dB, between repeated measurements, is at the most 1/2 times the standard deviation shown in FIG. 8. The assessment will, of course, be most appropriate and valuable if the parameters chosen with respect to sound incidence, reference point and posture correspond to the ones used in the original determination of the HTF or the ones which the HTF is adapted to simulate. While the description which follows discloses a number of specific methods for measuring and/or constructing HTFs so that they will comply with the generality criterion, the above assessment principle can be said to be a general way of judging the suitability of a candidate HTF for a particular use, or of judging whether an HTF implemented for a particular use is within the scope of the present invention.

While partial or full conformity, as discussed above, with the criteria illustrated in FIG. 22 can be said to be a basic requirement for the "generality" of an HTF, it is preferred that the HTFs fulfil, at least with respect to one of the curves, the more stringent criteria illustrated in FIG. 23 or even, at least with respect to one of the curves, the still more stringent criteria illustrated in FIG. 24. It should be noted that the reason why the curves relating to the 1/3 octave measurement are positioned lower than the pure tone curves is that the 1/3 octave curves are frequency averages. It will be understood that analogously to the criteria of FIG. 22, it is preferred, on each level of increasing stringency as defined by FIG. 23 and FIG. 24, that the HTFs fulfil the criteria for at least one of the appropriate curves of the figure in question.

It will be understood that while the above conditions or criteria define "general" HTFs for a broad population, there are certain evident criteria for what constitutes a population in the sense of the present disclosure, these criteria being associated with the anatomy of the ears and other anatomic characteristics of the population. Thus, it is presumed that a set of HTFs determined for a group of adults will not be optimal "general" HTFs for a population of small children. However, this does not introduce any uncertainty in the present context, as it has been found, as discussed above, that the generality criteria for a particular population will be fulfilled when the criteria of FIG. 22, preferably FIG. 23 and more preferably FIG. 24 are fulfilled for the population in question, that is, when an assessment as discussed above has been made on a representative (with respect to number and variation) subpopulation of the population in question, e.g. 25 persons of the population, or preferably more persons.

With respect to feature b):

According to the invention, it has surprisingly been found that it is possible, without any significant loss in quality, to reduce the duration of the time domain representation of high quality HTFs, i.e. high quality HIRs, used in binaural synthesis to 2 ms or even lower. This will very considerably reduce the demands on computing power when simulating the HTFs. When generating binaural signals, a sound input signal is typically convolved with the HIR. The terms "the duration of the time domain representation of an HTF" or equivalently "the duration of the HIR" refer to the length in time of that part of the HIR that is used for convolution of the sound input signal. Reduction of the duration of the time domain representation of an HTF or equivalently reduction of the duration of the HIR refers to the fact that a shorter part of the HIR is used for the convolution of the sound input signal. As short HTFs (or HIRs) have been provided according to the present invention, high quality HTFs implemented by means of digital filters can now be handled by moderate computing resources. The time domain representations of HTFs reported in the prior art range from 2.9 ms and up. When evaluating the duration of Head-related Impulse Responses it is important to study their frequency response. Examples are reported where an apparently short pulse cannot be truncated to less than a few milliseconds as the truncation changes its frequency response to an unacceptable extent because the impulse contains essential information over a longer time duration. It has been found that this is not the case for the high quality impulses determined as disclosed herein or otherwise complying with the criteria underlying the present invention, as illustrated below with reference to FIG. 9 and FIG. 10.
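The effect of truncation on the frequency response can be examined with a few lines of code. The sketch below is illustrative only: it assumes a sampling rate of 48 kHz (at which 2 ms corresponds to 96 samples), assumes that the HIR begins at its onset so that keeping the first samples is meaningful, and uses a made-up decaying sequence in place of a measured HIR. The function names are hypothetical.

```python
import cmath
import math


def truncate_hir(hir, sample_rate_hz, duration_s=0.002):
    """Keep only the first duration_s seconds of an impulse response."""
    n_keep = int(round(duration_s * sample_rate_hz))
    return hir[:n_keep]


def magnitude_db(hir, freq_hz, sample_rate_hz):
    """Magnitude in dB of the Fourier transform of hir at one frequency."""
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    value = sum(x * cmath.exp(-1j * w * n) for n, x in enumerate(hir))
    return 20.0 * math.log10(abs(value))


SAMPLE_RATE_HZ = 48000                       # assumed sampling rate
full_hir = [0.9 ** n for n in range(256)]    # toy response, not measured data
short_hir = truncate_hir(full_hir, SAMPLE_RATE_HZ)

# Largest level change caused by the truncation at a few probe frequencies.
probe_hz = [100.0, 1000.0, 4000.0, 8000.0, 16000.0]
max_change_db = max(
    abs(magnitude_db(short_hir, f, SAMPLE_RATE_HZ)
        - magnitude_db(full_hir, f, SAMPLE_RATE_HZ))
    for f in probe_hz
)
```

A direct-form convolution with the truncated response needs one multiplication per retained sample and per output sample for each ear, which is why a shorter HIR lowers the computing load in proportion to its length. Whether a given change in level is audible is a matter for listening tests, not for this numerical comparison.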

The quality of the HTFs obtained by the inventors has been proven by experiments wherein truncated versions of the HTFs obtained have been used for binaural synthesis. A panel of listeners has compared sound reproductions based on the truncated and the non-truncated versions of the same HTF and it was found that the HTFs obtained by the inventors could be truncated to the durations mentioned above without loss of quality of the audible impression perceived by the listener, the listening test being a three-alternative-forced-choice test. It will be understood that in this aspect of the invention, this kind of test is a general test which can be used to assess the truncatability of any HTF.

The literature contains disclosures of certain short impulses which are not proper HTFs according to the general definition. For example, transfer functions are reported where the pressures p in the ear canals are not divided by p_1 and therefore these measurements are not measurements of the HTFs but measurements of the combined transfer functions of the loudspeaker and the HTFs.

While the use of HTFs of duration of 2 ms is believed to be unique to the present invention, it has been found possible to use even shorter parts of HTFs, such as at the most 1.5 ms or shorter, e.g. at the most 1.2 ms or 1 ms or even down to at the most 0.9 ms or 0.75 ms or at the most 0.5 ms.

One criterion which should normally be observed in connection with the use of such short HTFs is that they should comply with certain requirements with respect to their DC value, such as described below in connection with feature c). While it is possible to use HTFs as short as described above without any DC adjustment, a normal precaution preferred by the inventors as a routine measure is to adjust the DC value of the short HTFs in accordance with the teaching given in connection with feature c).

With respect to feature c):

According to this feature, the value at zero Hz of the frequency domain representation of the HTF is in the range from 0.316 to 3.16, preferably in the range from 0.5 to 2, such as in the range from 0.7 to 1.4, more preferably in the range from 0.8 to 1.2, such as in the range from 0.9 to 1.1, and most preferably in the range from 0.95 to 1.05, and optimally set to 1.0.

Until the present invention, the value at zero Hz of the frequency domain representation of the HTF (the DC value of the HTF) seems to have attracted little or no attention in the art. However, the research and development of the present inventors has revealed that the DC value has a significant influence on the frequency domain representation of the HTF thereby influencing the sound quality, such as coloration, when the HTF is used in sound reproduction.

When HTFs have been measured, the DC value of the HTF is not measured as sound transducers are not able to generate a static sound pressure. Therefore, the DC value measured is related to secondary characteristics of the measurement set-up that often are not accurately controlled, such as DC offsets in the measurement amplifiers, and the DC values measured are not related to the HTFs under measurement.

The theoretical DC value of the HTFs is 1 as static sound pressure is not altered by the presence of the listener. Further, no diffraction occurs around the head at low frequencies and therefore the sound pressures at different points tend to be identical at lower frequencies. Measuring a value different from 1 corresponds to adding a constant in the time domain representation of the HTF or to adding a sine function to the frequency domain representation of the HTF which changes the appearance of the frequency response significantly, especially at lower frequencies and this changes the sound quality when the HTF is used for binaural synthesis. This is further illustrated below with reference to FIG. 11 and FIG. 12.

Thus, according to the present invention the DC value of the measured HTF is adjusted to be in the range from 0.316 to 3.16, preferably in the range from 0.5 to 2, such as in the range from 0.7 to 1.4, more preferably in the range from 0.8 to 1.2, such as in the range from 0.9 to 1.1, and most preferably in the range from 0.95 to 1.05, ideally 1, either directly in the frequency domain representation of the HTF or by adding a constant to the time domain representation of the HTF.
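For a sampled HIR, the value at zero Hz of its discrete Fourier transform equals the sum of its samples, so adding the same constant to each of the N samples shifts the DC value by N times that constant. The sketch below uses this relation to set the DC value of a toy response to 1. It is a minimal illustration under that assumption; the function names and the sample values are hypothetical.

```python
def dc_value(hir):
    """Value at zero Hz of the discrete Fourier transform of hir."""
    return sum(hir)


def adjust_dc(hir, target_dc=1.0):
    """Add one constant to every sample so that the DC value becomes target_dc."""
    offset = (target_dc - dc_value(hir)) / len(hir)
    return [x + offset for x in hir]


def dc_in_range(hir, low=0.316, high=3.16):
    """Check the DC value against the broadest range of feature c)."""
    return low <= dc_value(hir) <= high


# Toy response (assumption): its samples sum to 0.9 before adjustment.
measured_hir = [0.2, 0.9, -0.3, 0.1]
adjusted_hir = adjust_dc(measured_hir)
```

The broadest range, 0.316 to 3.16, corresponds to roughly plus or minus 10 dB around the theoretical value of 1.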

Further, the method of adjusting the DC value to be within an adequate range of the correct value of the HTF has the advantage that the frequency values of the HTF between the value of the lowest frequency measured and zero Hz are interpolated between these two values, whereas extrapolation has to be used when adjustment of the DC value is not used, and extrapolation leads to less accurate results and even in some cases to very poor results.
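The interpolation described here could, for instance, be carried out linearly in frequency. The text does not specify the interpolation rule, so the linear form below is an assumption, as are the function name and the example values.

```python
def interpolate_below(freq_hz, lowest_freq_hz, lowest_value, dc_value=1.0):
    """Linearly interpolate a transfer function value between zero Hz
    (where it equals dc_value) and the lowest measured frequency."""
    if not 0.0 <= freq_hz <= lowest_freq_hz:
        raise ValueError("freq_hz must lie between 0 and lowest_freq_hz")
    fraction = freq_hz / lowest_freq_hz
    return dc_value + (lowest_value - dc_value) * fraction


# Toy example (assumption): lowest measured point at 100 Hz with value 1.2.
value_at_50_hz = interpolate_below(50.0, 100.0, 1.2)
```

Because both end points are known, the values in between are bounded by them, which is the advantage over extrapolation noted above. Complex transfer function values can be passed in the same way as real ones.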

In many applications of the method of the invention, it is desired to simulate more than one sound source, and thus, for many practical embodiments of the method, the at least one sound input is filtered with at least two sets of two filters (FIG. 26), each set of two filters having been designed so that the two filters simulate the left ear and the right ear parts of a Head-related Transfer Function (HTF), or with at least three sets of two filters (FIG. 27), each set of two filters having been designed so that the two filters simulate the left ear and the right ear parts of a Head-related Transfer Function (HTF), and so on for at least four sets of two filters, at least five sets, etc.
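A digital realization of several sets of two filters might look as follows. This is a sketch under stated assumptions: each sound source has its own input signal and its own pair of HIRs, the filtering is a direct-form convolution, and the contributions of all sources are summed per ear. The summation and all names and values are assumptions made for illustration; FIG. 26 and FIG. 27 are not reproduced here.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def synthesize(sources):
    """Sum the binaural contributions of several sources.

    sources is a list of (signal, hir_left, hir_right) tuples.
    Returns the pair (left, right) of binaural signals.
    """
    left, right = [], []
    for signal, hir_left, hir_right in sources:
        for channel, hir in ((left, hir_left), (right, hir_right)):
            contribution = convolve(signal, hir)
            if len(contribution) > len(channel):
                channel.extend([0.0] * (len(contribution) - len(channel)))
            for i, value in enumerate(contribution):
                channel[i] += value
    return left, right


# Toy data (assumption): two sources with made-up two-tap responses.
source_a = ([1.0], [1.0, 0.0], [0.0, 0.5])
source_b = ([2.0], [0.0, 0.25], [1.0, 0.0])
left, right = synthesize([source_a, source_b])
```

With a single entry in `sources` the function reduces to the basic case of one set of two filters.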

In the following, a number of measures which have been found by the inventors to be valuable in the measurement and/or construction of HTFs are discussed. As appears from the discussion, these measures, and combinations thereof, have resulted in HTFs of qualities which must be believed to be hitherto unattained, and several such HTFs for a number of angles of sound incidence are disclosed specifically herein, in particular in the drawings. These HTFs and combinations thereof are believed to be novel per se and, like the novel measures for the measurement and/or construction of HTFs, constitute aspects of the present invention. As will be understood, these HTFs show the features identified under a)-c) above and, thus, their use constitutes preferred embodiments of the binaural synthesis aspect of the invention. However, it will also be understood that the invention is not limited to the use of these HTFs or to HTFs measured or constructed using the special techniques disclosed herein, but encompasses the novel use of any HTF or combination of HTFs, irrespective of how it was determined/provided, as long as the HTF or the combination shows the characterizing features defined herein.

As described in the above mentioned tutorial and by Hammershøi and Møller, "Sound Transmission to and within the Human Ear Canal", submitted for the Journal of the Acoustical Society of America, December 1994, the inventors' research and development have revealed that the transmission of sound pressures from one point to another in the ear canal is independent of the angle of sound incidence. The consequence of this is that the physical location of a point, where full directional information is present, may be chosen anywhere from the eardrum to the entrance of the ear canal. Possibly, even points a few millimeters outside the ear canal and in line with it may be used. It has also been shown that full directional information is present at the entrance to a blocked ear canal. Further, it has been shown by the inventors that a major part of the individual differences of sound transmission to the eardrums of different humans is caused by individual differences of the sound transmission along the ear canal. Therefore, the inventors presently prefer to measure the HTFs at the entrance to the blocked ear canal as full directional information has been shown to be present at this point and the individual differences between the HTFs of different humans have been estimated to be minimal at this point.

According to research of the inventors this is related to the fact that measurements at the entrance of the blocked ear canal are not related to the remaining sound transmission to the eardrum, since statistical analysis reveals that HTFs measured at the entrance of the blocked ear canal are uncorrelated with the remaining part of the sound transmission. According to the inventors this quality is evidently not maintained in measurements at other points in the ear, e.g. at the entrance of the open ear canal.

Measurement at the entrance to the blocked ear canal has previously been demonstrated to reduce the standard deviation between measurements, but the above surprising recognition that it is possible, using inter alia this measure, to arrive at "general" HTFs, realistically useful for a population, as contrasted to the individual approach previously believed to be necessary in high quality binaural synthesis, is novel and important.

The measurement of sound pressures at the entrance to the blocked ear canal has the further advantage that it is relatively easy to mount a microphone at this point. The inventors prefer to integrate the ear plug and the microphone.

Thus, according to a preferred embodiment of the invention, the reference point of the HTF or the HTFs is at the entrance, or close to the entrance, to the blocked ear canal.

The reference point (where the measuring microphone is arranged) may be outside the ear canal, or it may be inside the ear canal. If it is inside the ear canal, the blocking of the ear canal is positioned deeper in the ear canal. The reference point is normally at most 0.8 cm from the entrance to the blocked ear canal. More preferably, it is at most 0.6 cm from the entrance to the blocked ear canal, most preferably at most 0.3 cm from the entrance to the blocked ear canal, and ideally just at the entrance. Typically, the blocking of the ear canal is performed by means of a conventional ear plug, preferably of a compressible foam plastic material which, in the ear canal, will expand to completely fill out the ear canal across.

As mentioned above, the present invention provides a number of quality improvements of the principles according to which HTFs are measured, and the conditions under which they are measured. These improvements are reflected and manifested in the quality and utility of the new HTFs according to the invention. Thus, an aspect of the invention relates to the use of an HTF that has been established using at least one of the following measures a)-i):

a) the sound pressure p₂ from a spatially arranged sound source has been measured at the entrance, or close to the entrance, to the blocked ear canal of a person or of an artificial head,

b) the sound pressure p₁ from the sound source has been measured at a position between the ears of the test person or of the artificial head, with the test person or the artificial head absent,

c) the frequency domain description of the HTF has been calculated by dividing the frequency domain description of p₂ by the frequency domain description of p₁, optionally followed by low-pass filtering,

d) the time domain description of the HTF has been obtained by inverse Fourier transformation of the frequency domain description,

e) for a particular direction in relation to the test person or the artificial head, the left and right ear parts of the HTF have been measured simultaneously,

f) the test person has been standing during the measurement of the HTF,

g) the test person has been monitored by visual means such as video to ensure that the position of the head of the test person was not changed during the measurement of the HTF and/or any measurement of an HTF during which the position of the head differed from the correct position has been discarded,

h) the test person himself monitored the position of his head, e.g. by means of mirrors or a video monitor, in order to keep his head in the correct position during measurement of the HTF,

i) the measurements were carried out in an anechoic chamber, the measurement time for one HTF being at the most 5 seconds, preferably at the most 3 seconds, more preferably at the most 2 seconds, such as about 1.5 seconds.

In several disclosures of the prior art, the HTFs have been measured in an anechoic chamber, by establishing a sound field using a loudspeaker as the sound source followed by the measurement, frequency by frequency, of p₂ and then of p₁ or vice versa. The HTF is then calculated by dividing p₂ by p₁. However, this method only provides the gain of the HTF and the phase remains unknown.

Some prior art literature discloses measurements of the HTFs that do not include measurement of p₁. This means that the HTFs disclosed are not real HTFs but transfer functions that combine the transfer function of the loudspeaker used with the transmission of sound pressures from the loudspeaker to the point where the sound pressures have been measured. If the combined transfer function is used to reproduce binaural sound signals the listener will perceive the sound reproduced to be played by this loudspeaker.

Thus, it is an important aspect of the invention that the sound pressure p₁ created by a sound source has been measured at a position between the ears of the test person, with the test person absent, and the frequency and time domain representations of the HTF have been established as described above.

The optional low-pass filtering is performed to avoid the effect of the relatively low measurement values obtained at frequencies close to half the sampling frequency, mainly defined by the frequency characteristics of the loudspeakers and microphones and the anti-aliasing filters used in the measurement set-up. The division of the two sound pressures in this frequency range has been seen to create significant peaks and valleys in the frequency domain representation of the HTF if not followed by the low-pass filtering.
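Measures c) and d), together with the optional low-pass filtering, can be sketched as follows. This is a minimal illustration only: the function name `compute_htf`, the use of NumPy, and the raised-cosine roll-off between a chosen cutoff and half the sampling frequency are assumptions made for the example, not details taken from the measurement set-up described here.

```python
import numpy as np

def compute_htf(p1, p2, fs, cutoff_hz=None):
    """Return (H, h): frequency and time domain descriptions of an HTF.

    p1: pressure recorded at the head-centre position with the subject absent.
    p2: pressure recorded at the entrance to the blocked ear canal.
    Both are equal-length real arrays sampled at fs (hypothetical inputs).
    """
    n = len(p1)
    P1 = np.fft.rfft(p1)
    P2 = np.fft.rfft(p2)
    H = P2 / P1  # measure c): divide the two frequency domain descriptions

    if cutoff_hz is not None:
        # Assumed low-pass: raised-cosine taper from cutoff_hz down to zero
        # at half the sampling frequency, where P1 and P2 are both small.
        freqs = np.fft.rfftfreq(n, d=1.0 / fs)
        nyquist = fs / 2.0
        taper = np.ones_like(freqs)
        band = freqs > cutoff_hz
        taper[band] = 0.5 * (1.0 + np.cos(
            np.pi * (freqs[band] - cutoff_hz) / (nyquist - cutoff_hz)))
        H = H * taper

    h = np.fft.irfft(H, n=n)  # measure d): inverse Fourier transformation
    return H, h
```

The division assumes that P1 has no zero bins; a practical set-up would need to guard against that case.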

The simultaneous measurement of the two HTFs (for the left and the right ear) ensures that the position and orientation of the head of the test person or the artificial head is not changed between measurement of the HTF and/or that the time references of the measurements of the HTF are identical.

The time difference between the arrival of sound pressures from a specific sound source at the left ear and at the right ear of the listener is one of the most important parameters in sound localization. It is very important to determine this parameter, the interaural time difference, accurately. If the measurement of the HTF is not carried out simultaneously for the two ears, the ears of the test person have to be kept in the same position within millimeters during the two measurements. For example, a movement of 1 cm of the head of the test person corresponds to a time difference of 30 μs, and an uncertainty of the determination of the interaural time difference of this magnitude will typically influence the quality of the HTFs significantly. Therefore, the inventors have chosen the more practical and accurate solution to measure the HTF simultaneously for the two ears.
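The 30 μs figure can be checked with a one-line calculation. The sketch below assumes a speed of sound of 343 m/s (air at roughly 20 °C); that value and the function name are illustrative assumptions, not values given in the text.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air at about 20 degrees C

def timing_error_us(displacement_m, c=SPEED_OF_SOUND_M_PER_S):
    """Travel-time change, in microseconds, caused by moving an ear
    displacement_m metres along the direction of sound propagation."""
    return displacement_m / c * 1e6

error_us = timing_error_us(0.01)  # 1 cm head movement
```

With these assumptions a 1 cm movement gives a little over 29 μs, consistent with the rounded figure of 30 μs.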

When performing measurements of HTFs, it is most commonly prescribed in the art to use a seated test person during measurements as a seated test person is well supported and thereby in a good position to keep the head in a fixed position during measurements. The disadvantage of this method is that reflections from the knees prolong the impulse responses. As the present inventors have found no indications contradicting the general understanding that there is no difference in sound localization ability of a sitting and a standing person, they have preferred to use a standing test person during their measurements to obtain as short impulse responses as possible. However, this solution requires good support of the position of the test person, while simultaneously avoiding reflections from the supporting means. As illustrated in FIG. 6, the test person is supported at the lumbar region where the support does not cause any sound reflections. Further, the duration of a measurement is kept very short which eases the task of the test person of not moving the head during measurement. The duration of a measurement is 1.5 seconds which represents an optimum choice for signal to noise ratio and measurement duration.

Further, the test person has preferably been monitored by visual means, such as video, to ensure that the position of the head of the test person has not been changed during the measurement of the HTF.

If a movement of the head of the test person is detected during a measurement of the HTF, it has been preferred to discard such a measurement.

To assist the test person in keeping his head in a fixed position during the measurement the test set-up included a video monitor so that the test person himself could monitor the position of the head in order to keep the head in a correct position during measurement.

Having measured the HTFs for a group of test persons and for a set of directions to a set of sound sources in relation to the test person, it is now possible to construct an HTF (A) that for a given direction represents the measured HTFs corresponding to this direction.

One way of doing this is to select one of the HTFs measured as the HTF (A) after adjustment of the DC value to the range previously described.

The selected HTF (A) should be the one that for most persons provides a sound experience of a high quality when the HTF (A) is used to reproduce sound, e.g. by means of play back of sound recordings through filters with transfer functions that correspond to the selected HTFs (A), as described in more detail below.

One aspect of the invention relates to an HTF (A) obtained from HTFs (B) obtained according to any of the methods described above for at least two test objects, a test object being a person or an artificial head, by selecting an HTF which, when used in binaural synthesis, gives a sound impression which, when presented to a test panel, is found to give a high degree of conformity with real life listening to a sound source in the direction in question. Such a test is described in greater detail in the following.

Another related aspect of the invention is an HTF (A) obtained from HTFs (B) obtained according to any of the methods described above for at least two test objects, a test object being a person or an artificial head, by selecting an HTF which, when described objectively, e.g. in the frequency or the time domain, shows a high degree of similarity to individual HTFs of a population. Also this aspect is described in greater detail below. For a specific direction one criterion could be to select as the HTF (A) the HTF for which the sum of differences between the appertaining HTF and the other HTFs measured is minimal. The difference can be defined as the absolute value of the difference between two measured values of the corresponding HTFs or the squared value of the difference or any other function of the difference between two measured values of the corresponding HTFs. For a specific direction this means that, for each HTF measured, the difference between this HTF and each of the other HTFs of the set of HTFs measured is calculated for each time sample (or for each time sample of a selected subset of time samples) of the time domain representation of the HTFs, or for each frequency sample (or for each frequency sample of a selected subset of frequency samples) of the frequency domain representation of the HTFs, and all the calculated differences are then added to form a resulting sum. When performing the summation, the calculated values can be multiplied by weight factors. Then the HTF with the least resulting sum is selected as the HTF (A).
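The minimum-sum-of-differences criterion can be sketched as below. The function name `select_representative`, the array layout (one row per test object, one column per time or frequency sample), and the optional per-sample `weights` argument are assumptions made for illustration.

```python
import numpy as np

def select_representative(htfs, weights=None, squared=False):
    """Pick the HTF whose summed difference to all other HTFs is least.

    htfs: array of shape (n_objects, n_samples), all for one direction.
    weights: optional per-sample weight factors, shape (n_samples,).
    squared: use squared differences instead of absolute differences.
    Returns (index of the selected HTF, array of resulting sums).
    """
    htfs = np.asarray(htfs)
    # Pairwise sample-by-sample differences, shape (n, n, n_samples).
    diff = np.abs(htfs[:, None, :] - htfs[None, :, :])
    if squared:
        diff = diff ** 2
    if weights is not None:
        diff = diff * np.asarray(weights)
    totals = diff.sum(axis=(1, 2))  # one resulting sum per candidate HTF
    return int(np.argmin(totals)), totals
```

For three toy "HTFs" `[0, 0]`, `[1, 1]` and `[5, 5]`, the resulting sums are 12, 10 and 18, so the second one is selected.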

The representing HTF (A) can also be calculated on the basis of the measured HTFs, for at least two test objects, a test object being a person or an artificial head, by averaging, in the frequency domain, the amplitude of the HTFs (B), the amplitude averaging being performed, e.g., on pressure, power or logarithmic basis, followed by minimum phase or zero phase construction to obtain an HTF, the averaging being optionally followed by addition of a linear phase component giving an interaural time difference, the linear phase component or the interaural time difference suitably being obtained in a separate averaging of the linear phase components or the interaural time differences of the original HTFs (B). This method of constructing an HTF (A) is possible only because it has been found feasible, according to the present invention, to obtain measured HTFs which are very similar to each other. As a result of the fact that the deviations between HTFs according to the present invention are very low, it has become possible and relatively easy to recognize and utilize specific features of the HTFs, such as significant peaks and notches of the HIRs, amplitude peaks of the HTF, etc. Thus, an HTF (A) may be obtained from HTFs (B) for at least two test objects, a test object being a person or an artificial head, by averaging characteristic parameters of the HTFs (B), the characteristic parameters for instance being the frequency and the amplitude of characteristic points, e.g. peaks or notches, or the frequency of 3 dB points of peaks or notches, when the HTFs (B) are described in the frequency domain, or, the time and the amplitude of characteristic points, e.g. a characteristic positive peak or a characteristic negative peak, or the time of a characteristic zero crossing, when the HTFs are described in the time domain, or, the coordinates of, or the characteristic frequency and the Q-factor of poles and zeroes, when the HTFs are described in the complex s- or z-domain.
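The amplitude averaging and the subsequent minimum phase construction can be sketched as follows. The text does not say how the minimum phase is to be computed; the folded real cepstrum used here is one common technique and is swapped in purely for illustration. The function names and the requirement of a full-length, even-sized, strictly positive magnitude spectrum are likewise assumptions.

```python
import numpy as np

def average_magnitude(mags, basis="pressure"):
    """Average magnitude spectra (rows) on pressure, power or log basis."""
    mags = np.asarray(mags, dtype=float)
    if basis == "pressure":
        return mags.mean(axis=0)
    if basis == "power":
        return np.sqrt((mags ** 2).mean(axis=0))
    if basis == "log":
        return np.exp(np.log(mags).mean(axis=0))
    raise ValueError("basis must be 'pressure', 'power' or 'log'")

def minimum_phase_spectrum(mag):
    """Build a minimum phase spectrum with the given magnitude.

    mag: full-length FFT magnitude (even length N, symmetric, positive).
    Uses the folded real cepstrum; the phase is approximate for short N
    because the cepstrum is time-aliased, while the magnitude is kept.
    """
    n = len(mag)
    cepstrum = np.fft.ifft(np.log(mag)).real
    fold = np.zeros(n)
    fold[0] = 1.0
    fold[1:n // 2] = 2.0   # double the causal part
    fold[n // 2] = 1.0     # keep the Nyquist term once; the rest is zeroed
    return np.exp(np.fft.fft(cepstrum * fold))
```

A zero phase construction would simply use the averaged magnitude itself as a real-valued spectrum. A linear phase component for the interaural time difference can be added afterwards as a pure delay.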

A set of HTFs that represent the HTFs (B) measured for a set of directions to sound sources can be constructed according to the above described methods in such a way that the methods chosen for the construction of HTFs (A) for different specific directions could be chosen to be identical or different as considered advantageous for the actual application.

Further, a set of HTFs (A) could be constructed as described above but where one subset of the HTFs (A) could be constructed from HTFs (B) measured on a group of test persons while other subsets of HTFs (A) could be constructed from HTFs (B) measured on different groups of test persons.

An important aspect of the invention is an HTF (A) obtained from HTFs (B) for at least two test objects, a test object being a person or an artificial head, by averaging in the time domain or in the frequency domain

a) the time-aligned HTFs (B), the time alignment being performed, e.g., by

1) alignment to the onset of the pulse or to the first peak, or

2) alignment to maximum cross-correlation, or

b) the HTFs (B) from which the linear phase part and/or the all-pass phase part has been removed,

the averaging being optionally followed by addition of a linear phase component giving an interaural time difference, the linear phase components or the interaural time difference suitably being obtained in a separate averaging of the linear phase components or the interaural time differences of the original HTFs (B). The frequency axis, or a section or sections thereof, or the time axis, or a section or sections thereof, may have been compressed or expanded individually for each HTF to reduce the differences between the HTFs before the averaging.
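Alternative a) 2), alignment to maximum cross-correlation followed by time domain averaging, can be sketched as below. The choice of the first impulse response as the alignment reference, the whole-sample circular shift, and the function names are assumptions; the sketch presumes impulse responses that are zero-padded enough for the circular shift not to wrap signal energy around.

```python
import numpy as np

def align_to_reference(h, ref):
    """Shift h by whole samples so its cross-correlation with ref peaks at lag 0.

    Returns (aligned h, lag in samples that was removed).
    """
    xcorr = np.correlate(h, ref, mode="full")
    # Index 0 of the 'full' output corresponds to lag -(len(ref) - 1).
    lag = int(np.argmax(xcorr)) - (len(ref) - 1)
    return np.roll(h, -lag), lag

def average_time_aligned(hirs):
    """Align every impulse response to the first one, then average."""
    ref = np.asarray(hirs[0], dtype=float)
    aligned, lags = [], []
    for h in hirs:
        a, lag = align_to_reference(np.asarray(h, dtype=float), ref)
        aligned.append(a)
        lags.append(lag)
    return np.mean(aligned, axis=0), lags
```

The removed lags can be averaged separately for the left and right ear parts and reapplied as a delay, which corresponds to the separate averaging of the interaural time differences mentioned above.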

A set of HTFs relating to at least two angles of sound incidence may consist of HTFs obtained according to any of the above-described principles. The set may comprise HTFs (A) each of which has been individually selected among HTFs, not necessarily among HTFs from the same origin, preferably using the real life listening selection method mentioned above.

The invention provides a number of specific high quality HTFs which are completely defined. Thus, the invention relates to an HTF (A) which is selected from the group consisting of the 97 HTFs shown in each of FIG. 1, FIG. 2 and FIG. 3. These HTFs, described as in the figures, or in the form of tables, are extremely valuable commercial tools with hitherto unattainable quality, in any kind of technique where HTFs are used.

The invention also provides HTFs which are useful derivatives constructed on the basis of the above specific HTFs, namely HTFs obtained by interpolation between two or more of the 97 HTFs shown in each of FIG. 1, FIG. 2 and FIG. 3, or HTFs which, when used for binaural synthesis, give an audible impression which is not clearly different from the impression given by an HTF (D) shown in any of the figures in question or obtained by interpolation therebetween. In this context, the term "clearly different" means that a panel of inexperienced listeners obtains a score of at least 90 percent, preferably at least 80 and more preferably at least 70 and most preferably at least 50, percent correct answers when the two HTFs (A) and (D) are compared in a balanced four-alternative-forced-choice test, using programme material for which the HTFs are used or for which the HTFs are intended to be used.

For any preferred HTF (A) according to the invention,

a) the reference point of the HTF (B) or the HTFs (B) is at the entrance, or close to the entrance, to the blocked ear canal, and the HTFs (B) have been obtained from a group of test persons that is representative for the group of users for whom the HTFs (A) are intended, and/or

b) the HTF (A) is one which, when used for binaural synthesis, gives an audible impression which is not clearly different from the impression given by an HTF (D) according to a).

An HTF or a set of HTFs as described herein may be adapted to an individual listener or a group of listeners by modifying the interaural time difference of the HTF or the set of HTFs, the modification being based on

a) the physical dimension of the listener or the listeners, such as head diameter, distance between the ears, etc., or

b) a psychoacoustic experiment, where the HTF or the set of HTFs is used for binaural synthesis and the interaural time difference for each angle of a selected set of angles of sound incidence is adjusted so that the sound impression as perceived by the individual listener or the group of listeners is found to give a high degree of conformity with real life listening to a sound source in the direction in question.

Certain aspects of the invention relate to the construction of HTFs by approximation. These aspects are very valuable in many contexts, e.g. for small changes in position or orientation of the head. Thus, in one aspect of the invention, an approximate HTF for an angle of sound incidence may be obtained by interpolating HTFs corresponding to neighbouring angles of sound incidence, the interpolation being carried out as a weighted average of neighbouring HTFs, the averaging procedure preferably being performed as described above. In another aspect, an approximated HTF (A) can be made on the basis of a nearby HTF (B) by performing an adjustment of the linear phase of the HTF (B) to obtain substantially the interaural time difference pertaining to the angle of incidence for which the approximated HTF (A) is intended.
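Both approximations can be illustrated with a short sketch. The linear weighting by angle, the function names, and the frequency domain implementation of the delay are assumptions for the example; the sample-by-sample weighted average is only meaningful if the neighbouring impulse responses have first been time-aligned.

```python
import numpy as np

def interpolate_hirs(h_a, h_b, angle_a, angle_b, angle):
    """Weighted average of two time-aligned neighbouring impulse responses.

    The weights vary linearly with angle between angle_a and angle_b.
    """
    w = (angle - angle_a) / (angle_b - angle_a)
    return (1.0 - w) * np.asarray(h_a) + w * np.asarray(h_b)

def apply_delay(h, delay_s, fs):
    """Adjust the linear phase of h, i.e. delay it by delay_s seconds.

    Applying different delays to the left and right ear parts changes the
    interaural time difference without altering the magnitude response.
    """
    n = len(h)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum = np.fft.rfft(h) * np.exp(-2j * np.pi * freqs * delay_s)
    return np.fft.irfft(spectrum, n=n)
```

For example, `apply_delay(h_left, 20e-6, 48000.0)` would delay a hypothetical left ear response by 20 μs, which is less than one sample at that rate.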

One aspect of the invention relates to a method of obtaining an approximate HTF for a short distance between the listener and the sound source, comprising

a) combining

the left ear part of an HTF representing the geometric angle from the source position to the left ear position or optionally, if the left ear is not visible from the source position, the geometric angle from the source position tangentially to the part of the head obscuring the ear, with

the right ear part of an HTF representing the geometric angle from the source position to the right ear position or optionally, if the right ear is not visible from the source position, the geometric angle from the source position tangentially to the part of the head obscuring the ear,

and/or

individually adjusting the level of the left ear and the right ear parts of the HTF. The individual adjustment of the level of the left ear and the right ear parts of the HTF may be performed in accordance with the distance law for spherical sound waves, using the geometrical distance to the middle of the head and the geometrical distance to each of the two ears or optionally, where an ear is not visible from the source position, the geometrical distance to the tangent point of the part of the head obscuring the ear or to the ear passing the tangent point and following the curvature of the head.
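The level adjustment by the distance law for spherical waves (pressure inversely proportional to distance) can be sketched as below. The sketch covers only the case where the ear is visible from the source, so the straight-line distance applies; the coordinate convention, the function name, and the example positions are assumptions.

```python
import math

def ear_level_gain(source, ear, head_centre=(0.0, 0.0, 0.0)):
    """Linear pressure gain for one ear part of the HTF.

    Ratio of the source-to-head-centre distance to the source-to-ear
    distance (1/r law). Straight-line distances only: an ear hidden behind
    the head would need the tangent-point path described above instead.
    """
    r_centre = math.dist(source, head_centre)
    r_ear = math.dist(source, ear)
    return r_centre / r_ear

# Hypothetical geometry: source 0.3 m to the right, right ear 0.1 m from centre.
gain_right = ear_level_gain((0.3, 0.0, 0.0), (0.1, 0.0, 0.0))
gain_right_db = 20.0 * math.log10(gain_right)
```

In this example the near ear is 0.2 m from the source, so its level is raised by a factor of 1.5, or about 3.5 dB, relative to the head-centre reference.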

As described above, one of the applications of the HTF (A) is to use a set of HTFs (A) as a design target for signal processing means, such as a set of digital filter pairs, used to simulate the transmission of sound from a set of (fictive) sound sources to the left and right ears of the listener. The transfer functions of the set of digital filter pairs are designed to correspond to the appertaining HTFs (A). A binaural signal is generated by filtering a set of sound signals corresponding to the set of (fictive) sound sources with the set of digital filter pairs.
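The filtering step can be sketched with FIR filter pairs applied by direct convolution. The function name, the use of FIR impulse responses as the filter representation, and the simple summation of all sources into one left and one right channel are assumptions for illustration.

```python
import numpy as np

def binaural_synthesis(signals, filter_pairs):
    """Filter each source signal with its (left, right) impulse response
    pair and sum the results into one two-channel binaural signal.

    signals: list of 1-D arrays, one per (fictive) sound source.
    filter_pairs: list of (h_left, h_right) tuples, one per source.
    """
    n_out = max(len(s) + max(len(hl), len(hr)) - 1
                for s, (hl, hr) in zip(signals, filter_pairs))
    left = np.zeros(n_out)
    right = np.zeros(n_out)
    for s, (hl, hr) in zip(signals, filter_pairs):
        y_left = np.convolve(s, hl)
        y_right = np.convolve(s, hr)
        left[:len(y_left)] += y_left
        right[:len(y_right)] += y_right
    return left, right
```

For long signals a block-based fast convolution would normally replace `np.convolve`, but the result is the same.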

Thus, an HTF may be obtained from the above HTFs according to the invention by further processing, such as filtering, equalizing, delaying, modelling, or any other processing that maintains the information contents inherent in the original HTF or set of HTFs, the said further processing being substantially identical for the left and right ear parts of the HTF, or for a set of HTFs corresponding to different angles of sound incidence being substantially identical for the different directions but not necessarily identical for the left and the right ear parts of the HTFs.

Examples of such signal processing which are useful in various applications are signal processings which have been performed so that

a) the HTF of a specific angle, e.g. in the frontal plane, has a flat frequency response, or

b) the amplitude of a binaural signal formed by binaural synthesis of a diffuse sound field is substantially identical to the amplitude of the diffuse sound field itself, or

c) the amplitude of a binaural signal formed by binaural synthesis of a specific sound field is substantially identical to the amplitude of the sound field at the p₁ reference point.

In some practical uses of the method of the invention, e.g., mixing consoles, at least two sound inputs (1) are combined into one sound input (2) which is filtered with one set of two filters simulating an HTF (FIG. 25). Typically, the sound inputs (1) which are combined are sound inputs belonging together in spatial groups, such as "from the front", "from behind", "from the right side", "from the left side", etc., in relation to the listener.

An important use of the binaural synthesis method of the invention is for simulation of a sound field of a specific environment, such as a room, e.g. a concert hall, wherein transmission of sound from a set of sound sources with specific positions in said environment to a receiving point with a specific position in said environment is simulated by

a) forming, for each of a number of transmission paths for each sound source, a binaural signal (A), and

b) combining the binaural signals (A) for each sound source into a binaural signal (B), and

c) combining the binaural signals (B) of the set of sound sources into a resulting binaural signal (C).

Another important utilization of the invention is for noise measurement and/or assessment of the effect of noise, or any other measurement and/or simulation where a description of a sound transmission is involved, in which binaural signals produced as discussed herein and/or HTFs as characterized herein are utilized to increase the generality.

For some uses of the invention, including, e.g., virtual reality applications or teleconferencing, it is useful to sense position and/or orientation, and/or changes in position and/or orientation, of the head of a listener and modify the electronic signal processing in dependence of the sensed position and/or orientation and/or changes in position and/or orientation. This could, e.g., be used to give the impression that the virtual sources remain in position irrespective of head movements.

The sensing of the position and/or orientation, and/or changes in position and/or orientation, of the head of a listener, may be performed by

a) transmitting at least one pulse of energy, such as an ultrasonic wave pulse or an infrared light pulse, adapted to be received by one or more receiving means mounted at and following the movements of the head of the listener,

b) detecting the arrival time or each of the arrival times of the transmitted energy pulse or pulses at the receiving means or each of the receiving means and optionally detecting or recording the time of transmission or each of the times of transmission from the corresponding transmitter or transmitters, and

c) calculating the position and/or orientation of the head of the listener based on the detected arrival time or times and optionally on the detected or recorded time or times of transmissions.
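Step c) can be illustrated for the simplest case of one ultrasonic transmitter and two receivers, one at each ear. Everything specific in the sketch is an assumption: the speed of sound of 343 m/s, the 0.18 m receiver spacing, the far-field approximation for the angle, the sign convention (positive yaw when the right ear is nearer the transmitter), and the function name.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air at about 20 degrees C

def head_distance_and_yaw(t_tx, t_left, t_right, ear_spacing_m=0.18,
                          c=SPEED_OF_SOUND_M_PER_S):
    """Estimate head distance and yaw from ultrasonic pulse arrival times.

    t_tx: time of transmission; t_left, t_right: arrival times at the
    receivers mounted at the left and right ear (all in seconds).
    Returns (mean distance in metres, yaw in radians).
    """
    d_left = c * (t_left - t_tx)
    d_right = c * (t_right - t_tx)
    distance = 0.5 * (d_left + d_right)
    # Far-field approximation: path difference = spacing * sin(yaw).
    ratio = (d_left - d_right) / ear_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against measurement noise
    return distance, math.asin(ratio)
```

If only the orientation is needed, the time of transmission cancels out of the path difference, which is why recording it is optional in step b).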

The signal processing in the method of the invention can, if desired, additionally include compensation of transfer characteristics of a signal-to-sound transducer, such as its frequency dependent sensitivity, impedance relations, etc., thereby approaching the perception of an ideal signal-to-sound transducer. Further, the characteristics of the transmission of sound from the signal-to-sound transducer to a specific point, e.g. to a specific point in the ear canal of a listener, could be included in the compensation. On the other hand, many sound reproductions which are perceived as pleasant or interesting do in fact include transfer characteristics or coloration of loudspeakers, or sound modifications characteristic of the room in which the loudspeakers are arranged, and thus, another interesting possibility is to supplement the binaural signal with echoes and/or reverberation and/or coloration to simulate a non-uniform signal response of the virtual signal-to-sound transducers and/or to simulate that the virtual signal-to-sound transducers are arranged in an imaginary room. These additional signals may or may not be coded with directional and/or distance information about their virtual sound sources.

As indicated above, the signal processing may additionally include compensation for the difference in pressure division at the input to the ear canal when the ear is occluded, respectively unoccluded, by a headphone. A way of obtaining a description of the difference in pressure division at the input to the ear canal when the ear is occluded, respectively unoccluded, by a headphone, comprises measuring the transmission from the headphone to the sound pressure

at the entrance, or close to the entrance, of the blocked ear canal, and

at the entrance, or close to the entrance, of the open ear canal,

the ratio of the frequency domain descriptions of these transmissions being obtained as characteristic of the pressure division (X) in this situation,

and

measuring the transmission from a sound source that does not influence the acoustic radiation impedance of the ear, to the sound pressure

at the entrance, or close to the entrance, of the blocked ear canal, and

at the entrance, or close to the entrance, of the open ear canal,

the ratio of the frequency domain descriptions of these transmissions being obtained as characteristic of the pressure division (Y) in this situation,

and obtaining the ratio X/Y which constitutes the frequency domain description of the difference in pressure division.
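The computation of X, Y and X/Y can be sketched as below. The text does not state which way round each ratio is formed; the sketch assumes open over blocked for both X and Y, which is an assumption and must in any case be the same for the two. The function and argument names are hypothetical, and the inputs are taken to be complex frequency domain descriptions on a common frequency grid.

```python
import numpy as np

def pressure_division_difference(hp_blocked, hp_open, ref_blocked, ref_open):
    """Frequency domain description of the difference in pressure division.

    hp_*:  transmission from the headphone to the blocked / open ear canal
           entrance.
    ref_*: the same two transmissions for a source that does not influence
           the acoustic radiation impedance of the ear.
    Assumed convention: each pressure division is open divided by blocked.
    """
    X = np.asarray(hp_open) / np.asarray(hp_blocked)    # with headphone
    Y = np.asarray(ref_open) / np.asarray(ref_blocked)  # unoccluded ear
    return X / Y
```

A result close to 1 at all frequencies would indicate that the headphone loads the ear in much the same way as free air does, so that little compensation is needed.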

Any compensation for signal-to-sound transducers such as headphones and loudspeakers may be adapted to the individual listener, by determining the appropriate transfer characteristics for the individual user.

The signals subjected to the signal processing described above could be signals which are adapted to be decoded into sound representing signals, e.g. broadcast signals, by decoding them in the manner corresponding to the coding scheme of the appropriate sound reproducing system and then processing them into a binaural signal as described above. Whether or not a particular broadcast signal is adapted to be decoded in a particular system can easily be assessed by providing the signal to a decoder pertaining to the system and analysing the decoded signals.

Headphones constitute preferred signal-to-sound transducers for the binaural signal. In the present context, the term headphones includes conventional headphones and any other sets of two portable signal-to-sound transducer units adapted to be placed on a human adjacent or close to the ears of the human.

Especially attractive headphones for use in the method of the invention could be wireless headphones adapted for any kind of wireless transmission of the binaural signal, such as electromagnetic, optical, infrared, ultrasonic, etc.

The binaural signal is normally adapted to be emitted by means of headphones, but it is within the scope of the invention to reproduce the signal by means of two loudspeakers. When loudspeakers are used, crosstalk of the loudspeakers may, if desired, be counteracted by supplementing the binaural signal with artificial crosstalk, which may either be incorporated in the binaural signal or consist of additional electrical signals. Crosstalk is caused by the fact that the left ear is able to hear the right loudspeaker and vice versa, in contrast to the headphones.

When two loudspeakers are used to reproduce the sound corresponding to the binaural signal the position of the listener in relation to these loudspeakers is rather critical because of the cross-talk phenomena. However, by sensing the position of the head of the listener and modifying the electronic signal processing in response to the sensing, it will be possible to compensate the cross-talk in accordance with the position of the head of the listener, thereby dramatically improving the quality of the listening experience.

Both in the cases where headphones are used and in the cases where two loudspeakers are used, the position and/or orientation, and/or changes in position and/or orientation, of the head of a listener can, as indicated above, be sensed by means of suitable sensing means, and the electronic signal processing can be modified in dependence of the sensed position and/or orientation and/or changes in position and/or orientation. The effects aimed at in the modification may range from minor corrections or adjustments which are desirable in connection with head movements when listening to binaural sound reproduction, to modifications adapted to impart to the listener the perception that the virtual sound sources remain in position irrespective of the position and/or orientation, and/or changes in position and/or orientation, of the listener's head, or even modifications where special artificial effects are aimed at, such as a perception that the virtual spatial sound field continues to turn a little due to "inertia" after the listener has stopped a turn of the head. As will be understood by a person skilled in the art, such modifications of the electronic processing are possible in particular where the HTFs are implemented by digital filters, such as is described in detail in the following.

One way of sensing the parameters of the position and orientation of the listener mentioned above is to apply a known varying magnetic field to the surroundings of the listener and to apply a set of crossing coils to the head of the listener. When the magnetic field applied to the listening room is known, it is possible to derive the position and orientation of the listener's head from the voltages generated in the crossing sensing coils. Analogous methods could be used for other kinds of fields, such as ultrasonic fields, applied to the listening room, with appropriate detectors applied to the listener's head, or equipment based on video cameras coupled to image recognition means could be utilized.

Other aspects of the invention relate to applications of the HTFs used for binaural synthesis utilizing the generality aspect of these HTFs, for example in designing artificial heads, in designing frequency response of headphones, in computer models of the human binaural sound localization or perception in general, etc.

In accordance with what is discussed above, an embodiment of the invention comprises transmitting the binaural signals in the form of modulated ultrasonic waves, the waves being received by a listener equipped with two receiving means each of which is mounted close to the appertaining ear of the listener, changes in orientation of the listener's head relative to a reference orientation being compensated on the basis of the difference of the travel time of the ultrasonic wave pulses between the two receiving means so that the listener will perceive that virtual sound sources remain in a reference position irrespective of the orientation of the listener's head, the compensation being automatic or carried out by involving electronic signal processing.

For a number of practical uses, such as in air traffic control, in control of cabs or trucks, in messenger offices, in life saving stations, in central offices of watchmen, in telephone meetings, in meetings using audio-visual communication means, etc., the method of the present invention can be applied for communication, comprising transforming, by signal processing means,

signals (A₁ . . . A_(n)) of at least one single channel communication system and/or at least one multichannel communication system which signals are adapted for being supplied to at least one signal-to-sound transducer, or

signals which are adapted for being decoded into such signals (A₁ . . . A_(n))

into a binaural signal (C), so that the binaural signal, when reproduced, is capable of imparting to a receiver of the communication a perception of listening to a spatial sound field with a set of n individually positioned virtual sound sources, each of which transmits one of the signals (A₁ . . . A_(n)).

In connection with this, a valuable embodiment is where the position and orientation of the receiver's head are monitored, and head position and head orientation data obtained in the monitoring are used to enable the receiver to selectively transmit a message to one of the transmitters corresponding to one of the signals (A₁ . . . A_(n)) by turning his head in the direction of the virtual sound source corresponding to said transmitter.

A special utilization of the method of the invention is for multichannel sound reproduction, e.g., Dolby Surround, Stereo, Quadraphony, or any HDTV multichannel specification, comprising transforming, by signal processing means,

signals (A₁ . . . A_(n)) of a multichannel sound reproducing system which signals are adapted for being supplied to n different signal-to-sound transducers of the multichannel sound reproducing system, or

signals which are adapted for being decoded into such signals (A₁ . . . A_(n))

into a binaural signal (C) by the method of the invention so that the binaural signal, when reproduced, is capable of imparting to a listener a perception of listening to a spatial sound field similar to the sound field which would have resulted from listening to the n signal-to-sound transducers spatially arranged in a room.

A range of uses of the method of the invention are related to situations where the binaural signals are used for positioning a set of sounds at specific virtual positions in relation to an operator, such as, e.g., operators of industrial processes, pilots and astronauts, flight controllers, video game players, users of interactive TV, surgeons operating on patients, etc.

One example of this is where a moving virtual sound source with a characteristic sound moves continuously or discontinuously between specific positions of a set of virtual sound sources, the operator being enabled to communicate a specific message to the system according to a particular virtual sound source by prompting the system when the moving virtual sound source is positioned substantially at the position of said virtual sound source. The position of the moving virtual sound source may be controlled by the operator, and/or by the orientation and/or position of the head of the operator, and/or the positions may be dynamically controlled by a computer in accordance with a set of rules or a predefined scheme.

One application hereof is in guidance of the movement of an object, such as a robot, or a person, such as a blind person, where the method is used for controlling or assisting the movement and/or position of an object and/or a living being by dynamically positioning a virtual sound source in relation to the object and/or living being, so as to guide the object and/or the living being in relation to the position of the virtual sound source.

In any embodiment of the invention, the binaural signal may, of course, be stored on an audio storage medium or broadcast. As a special feature, each sound input (2) representing a combination of more than one sound input (1) may be stored or broadcast separately, such as in a separate track or in a separate channel, respectively, the binaural filtering being carried out before or after storing or broadcasting.

A number of aspects of the invention comprise the use of HTFs of the generality obtained according to the present invention in computer modelling or analysing the cerebral human binaural sound localization ability.

Another such aspect comprises a method for designing headphones, wherein the transfer characteristics of the headphones are adapted to resemble an HTF characterized according to the invention for a given direction, e.g., the frontal direction, or to resemble weighted averages of such HTFs corresponding to averages of given directions.

A further such aspect relates to an artificial head having HTFs which correspond substantially to HTFs determined according to the invention for all angles of sound incidence, or at least for angles of sound incidence which constitute part of the total sphere surrounding the artificial head, such as the upper hemisphere or the frontal region. This can be done by adapting the geometric characteristics of the artificial head and/or the acoustic properties of the materials used so as to approximate the HTFs of the artificial head to HTFs according to the invention for all angles of sound incidence, or at least for angles of sound incidence which constitute part of the total sphere surrounding the artificial head, such as the upper hemisphere or the frontal region.

In the following, the invention will be described in more detail, by way of example, with reference to the accompanying drawings, in which:

FIGS. 1 (1)-(6) show the time domain description of a set of HTFs (1) of a specific person according to the invention, and (7)-(12) show the frequency domain description of the HTFs (1),

FIGS. 2 (1)-(6) show the time domain description of a set of HTFs (2) according to the invention, obtained as an average across HTFs for 40 persons, by averaging the minimum phase approximation in decibels frequency by frequency, followed by the addition of the average linear phase parts of the HTFs, and (7)-(12) show the frequency domain description of the HTFs (2),

FIGS. 3 (1)-(6) show the time domain description of a set of HTFs (3) according to the invention, obtained as an average across 40 persons, by averaging the time aligned time domain representations of the HTFs sample by sample, followed by the addition of the average delays of the HTFs, and (7)-(12) show the frequency domain description of the HTFs (3),

FIG. 4 is a photo of a miniature microphone mounted in the ear of a test person to measure the pressure (p₂) at the blocked ear canal,

FIG. 5 shows the placement of a microphone at the blocked entrance to an ear canal,

FIG. 6 is a photo of the measurement set-up in an anechoic chamber for measurement of an HTF,

FIG. 7 shows graphs of the frequency domain representation and the time domain representation of a specific HTF for one test person,

FIG. 8 shows the standard deviation of the gain of HTFs for different groups of test persons for comparison of measurements performed according to the present invention with measurements performed according to prior art,

FIG. 9 shows an example of a Head-related Impulse Response,

FIG. 10 shows the frequency domain representation of the Head-related Impulse Response of FIG. 9 truncated to different lengths,

FIG. 11 shows an example of a Head-related Impulse Response adjusted for different DC values,

FIG. 12 as FIG. 11, but for the frequency domain representations,

FIG. 13 shows an example of averaging the time domain representations of a set of HTFs,

FIG. 14 as FIG. 13, but for the frequency domain representations,

FIG. 15 shows an example of logarithmic averaging of the frequency domain representations of a set of HTFs,

FIG. 16 shows an example of a minimum phase representation and an example of a zero phase representation of an averaged set of Head-related Impulse Responses,

FIG. 17 shows an example of averaging the time domain representations of a set of HTFs after time alignment,

FIG. 18 as FIG. 17, but for the frequency domain representations of the HTFs,

FIG. 19 shows an example of interpolation of the time domain representations of the HTFs to create a new HTF corresponding to a direction that is in between four directions corresponding to four known HTFs,

FIG. 20 as FIG. 19, but for the frequency domain representations,

FIGS. 21 (a)-(d) show an example of obtaining an approximate HTF for a short distance between the listener and the sound source,

FIGS. 22, 23, 24 show standard deviations of the amplitude, in dB,

FIG. 25 is a schematic diagram showing two sound inputs combined into a single sound input that is filtered by one set of two filters respectively simulating left and right HTFs; and

FIGS. 26 and 27 are schematic diagrams showing a sound input that is filtered by two and three sets of two filters, respectively, wherein each set of two filters respectively simulates left and right HTFs.

FIGS. 1-3 show three different sets of HTFs obtained by different methods according to the present invention, one in each figure. In each of the figures, the descriptions of the HTFs are characterized by their angle of incidence, stated as (azimuth, elevation). In each of the time domain descriptions, the upper curve pertains to the left ear, and the lower curve pertains to the right ear. In each of the frequency domain descriptions, the thick line curve pertains to the left ear, and the thin curve pertains to the right ear. The "tag" at each side of the frequency domain curves represents 0 dB.

The HTFs shown in FIGS. 1-3 are examples of HTFs according to the current invention, the HTFs of FIG. 1 being a single person's HTFs, whereas the HTFs of FIG. 2 and FIG. 3 are averages across a large number of persons, and have been obtained according to aspects of the invention. The average HTFs of FIG. 2 have been obtained as an average across HTFs for 40 persons, by averaging the minimum phase approximation in decibels frequency by frequency, followed by the addition of the average linear phase parts of the HTFs. The HTFs of FIG. 3 have been obtained as an average across 40 persons, by averaging the time aligned time domain representations of the HTFs sample by sample, followed by the addition of the average delays of the HTFs.

FIG. 6 shows a set-up for a measurement of the HTFs according to the present invention performed in an anechoic chamber. A known signal is sent to a loudspeaker positioned in the direction corresponding to the HTF to be measured. A miniature microphone of the type Sennheiser KE 4-211-2 is placed at each of the blocked entrances to the ear canals of the test person as shown in FIG. 4 and FIG. 5.

The KE 4-211-2 is a pressure microphone of the back electret type, and it has a built-in FET amplifier. The microphone itself has a sensitivity of approximately 10 mV/Pa. Coupled with a gain as suggested in the data sheet, the sensitivity increases to approximately 35 mV/Pa. A small battery box was used, and in order to increase the output signal and to reduce the output impedance, a 20 dB amplifier was built into the same box. Two selected microphones were used throughout the experiment, one for each ear.

The reference sound pressure p₁ from the loudspeaker was measured with each of the miniature microphones. The microphone was placed at the position where the middle of the test person's head would be during measurement. In order to disturb the field as little as possible, the microphones were fixed by a thin wire and with an orientation giving 90° incidence of the sound wave from the loudspeaker. In this way, the p₁ measurement was minimally influenced by the presence of the microphone in the sound field.

During measurement of the sound pressure p₂ at the entrance to the blocked ear canal, the microphone was mounted in an EAR earplug placed in the ear canal. The microphone was inserted in a hole in the earplug, and then the soft material of the earplug was compressed during insertion in the ear canal. As the earplug relaxed, the outer end of the ear canal was completely filled out. The end of the earplug and the microphone were mounted flush with the ear canal entrance (see FIG. 4 and FIG. 5).

The measurements were carried out in an anechoic chamber with a free space between the wedges of 6.2 m (length) by 5.0 m (width) by 5.8 m (height). The test person was standing on a platform in a natural upright position, and a small backrest mounted on the platform helped the test person to stand still.

To assist in the control of horizontal position and orientation of the test person's head, the test person had a paper marker on top of the head. This marker was observed through a video camera placed right in front of the test person and shown on a moveable monitor to the test person. Using this, the test person could correct position and azimuth.

The operators had a similar monitor for observation of the test person's exact position and for checking that the test person did not move during each single measurement. If movements were observed, the measurement was discarded and redone.

The loudspeakers used were 7 cm membrane diameter midrange units (Vifa M10MD-39) mounted in 15.5 cm diameter hard plastic balls.

The general purpose measuring system known as MLSSA (Maximum Length Sequence System Analyzer) was used. Maximum length sequences are binary two-level pseudo-random sequences. The basic idea of the MLS technique is to apply an analogue version of the sequence to the linear system under test, sample the resulting response, and then determine the system impulse response by cross-correlation of the sampled response with the original sequence.
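
The principle can be sketched in a few lines. The example below is not the MLSSA implementation; it is a minimal illustration that assumes a 4-bit shift register (sequence length 15), an ideal noiseless linear system modelled by circular convolution, and hypothetical function names. It relies on the known property that the circular autocorrelation of a ±1 maximum length sequence of length N equals N at lag zero and -1 at every other lag.

```python
def mls(order, taps):
    """Generate one period of a maximum length sequence as +1/-1 values.
    `taps` are 1-based shift-register positions XORed to form the feedback;
    (4, 3) with order 4 is a maximal choice giving period 15."""
    state = [1] * order
    sequence = []
    for _ in range(2 ** order - 1):
        sequence.append(1.0 if state[-1] else -1.0)
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]
    return sequence


def circular_convolve(x, h):
    """Steady-state response of a linear system h to the periodic input x."""
    n = len(x)
    return [sum(h[m] * x[(k - m) % n] for m in range(len(h))) for k in range(n)]


def recover_impulse_response(y, x):
    """Circularly cross-correlate the response y with the MLS x, then remove
    the constant bias caused by the -1 off-peak autocorrelation values."""
    n = len(x)
    r = [sum(y[i] * x[(i - k) % n] for i in range(n)) for k in range(n)]
    total = sum(r)  # equals the sum of the impulse response samples
    return [(rk + total) / (n + 1) for rk in r]
```

Feeding `mls(4, (4, 3))` through a short test response with `circular_convolve` and then calling `recover_impulse_response` returns that response padded with zeros to the sequence length.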

The above method of performing measurements using maximum length sequences offers a number of advantages compared to traditional frequency and time domain techniques. The method is basically noise immune, and combined with averaging, the achieved signal to noise ratio is high. A thorough review of the MLS method is given by Rife and Vanderkooy: "Transfer-function measurement with maximum-length sequences", Journal of the Audio Engineering Society, vol. 37, no. 6.

For the purpose of measuring at both ears simultaneously, two MLSSA systems were used, coupled in a master-slave configuration by a purpose-made synchronization unit allowing sample synchronous measurements.

The 4 V peak-to-peak stimulus signal from the master MLSSA board was sent to the power amplifier (Pioneer A-616) that was modified to have a calibrated gain of 0.0 dB. From the output it was directed through a switch-box to the loudspeaker in the measurement direction. The free field sound had a level of 75 dB(A) at the test person's position, a level where the stapedius was assumed to be relaxed.

From the microphone the signal was sent through a measuring amplifier, B&K 2607.

The sampling frequency of 48 kHz was provided by an external clock. To avoid frequency aliasing, the 20 kHz Chebyshev low pass filter of the MLSSA board and the 22.5 kHz low pass filter of the measuring amplifier were used. Also the 22.5 Hz high pass filter on the measuring amplifier was active.

Preliminary measurements on the free field setup using the maximum MLS length offered by MLSSA, 65535 points, showed that a length of 4095 points was sufficient to avoid time aliasing. In order to achieve a high signal to noise ratio, the recording was averaged 16 times, called pre-averaging in the MLSSA system. Even with this averaging the total time for a measurement was as short as 1.45 seconds. During this period the test persons were normally able to stand still. All measured impulse responses were very short, and only the first 768 samples of each impulse response, corresponding to 16 milliseconds, were computed and saved.

Results of the measurements were impulse responses for the transmission from input to the power amplifier to output of the measuring amplifier. The post processing needed to obtain the wanted information was carried out in MATLAB.

The measured impulse responses all included an initial delay, corresponding to the propagation time from the loudspeaker to the measuring point (approximately 6 milliseconds). All responses were very short, with a duration of only a few milliseconds. Therefore, only samples from 256 through 511 were processed (time from 5.33 ms to 10.65 ms). The restriction to this time window eliminated reflections from the monitor in the anechoic chamber.
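
The relation between the sample indices and the stated times follows directly from the 48 kHz sampling frequency. The short sketch below reproduces that arithmetic and extracts the window; the function names are illustrative and not part of the original post processing.

```python
SAMPLING_FREQUENCY = 48_000  # Hz, as stated for the measurements


def sample_to_ms(index, fs=SAMPLING_FREQUENCY):
    """Time in milliseconds of a given sample index."""
    return 1000.0 * index / fs


def extract_window(impulse_response, start=256, stop=511):
    """Keep samples `start` through `stop` inclusive (256 samples by default)."""
    return impulse_response[start:stop + 1]
```

With these definitions, sample 256 lies at about 5.33 ms, sample 511 at about 10.65 ms, and the 768 saved samples span 16 ms.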

For determination of the HTF (P₂/P₁) the selected portions of the p₁ and p₂ impulse responses were Fourier transformed, and a complex division was carried out in the frequency domain. As the same equipment was involved during measurement of p₁ and p₂, the influence of the equipment cancels out in the division.
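
The cancellation of the equipment response can be illustrated with a small sketch. It uses a naive discrete Fourier transform from the standard library (adequate for short windows; the original processing was done in MATLAB), assumes that the spectrum of p₁ has no zero bins, and uses hypothetical function names.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse transform, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def htf_from_pressures(p1, p2):
    """Complex division P2/P1, bin by bin; p1 and p2 are equally long windows
    of the reference and blocked-ear-canal impulse responses."""
    spectrum_p1 = dft(p1)
    spectrum_p2 = dft(p2)
    return [b / a for a, b in zip(spectrum_p1, spectrum_p2)]
```

If p₁ contains only the equipment response and p₂ contains the same equipment response convolved with the head-related response, the quotient no longer depends on the equipment, and `idft` applied to it returns the head-related impulse response.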

If it is desirable to simulate the HTF using analog filters, then the frequency domain representation of the HTF can form the basis for the synthesis of analog implementations of the filters as described in any text book on filter synthesis.

The impulse response of the HTF was determined through an inverse Fourier transform of P₂/P₁. Before the transformation, P₂/P₁ was filtered by a 4th order Butterworth filter (bilinearly transformed) in order to prevent frequency aliasing.

If it is desirable to simulate the HTF using digital techniques, then the Head-related Impulse Responses can be digitised and stored in the storage(s) of the digital implementations of the filters.

An example of the frequency domain representation and the time domain representation of a specific HTF for one test person is shown in FIG. 7. To benefit from these advantageous HTFs it is important to understand that the signal-to-sound transducer, such as headphones, has to be calibrated correctly.

As already mentioned, the entrance to the blocked ear canal has been chosen as the measurement point because the individual differences between HTFs of different test persons have been found to be very low, among other things because of this choice. It has been shown that a major part of the differences between individual HTFs is added by the transmission of the sound pressures through the individual ear canals. Thus, it is important to be able to reproduce the sound pressures, e.g. by headphones, at the reference point of the measurement at the entrance to the blocked ear canal without adding any individual differences to the sound pressures. This means that the transfer function describing the characteristics of transmission of a sound signal from the terminals of the headphones to the reference point at the blocked ear canal must have a flat frequency response so that the frequency domain representations of the HTFs will not be distorted.

Further, the headphone must be open, as defined in the above mentioned tutorial by Henrik Miller, which is equivalent to having a free field equivalent coupling to the ear as it has later been denoted, so that the impedance looked out into from the ear is not changed when the headphone is applied to the ear, or alternatively the headphones should be adjusted to compensate for their transmission impedance.

FIG. 8 shows the standard deviation of the gain of HTFs for different groups of test persons for comparison of measurements performed according to the present invention with measurements performed according to prior art. The graphs of FIG. 8 are based on measurements of the HTFs of a significant number of test persons. The prior art measurements are disclosed in: F. L. Wightman and D. Kistler, "Headphone Simulation of Free-Field Listening, I: Stimulus Synthesis, II: Psychoacoustical Validation," J. Acoust. Soc. Am. 85(2), 858-878, 1989 and in: P. A. Hellstrom and A. Axelsson, "Miniature microphone probe tube measurements in the external auditory canal", J. Acoust. Soc. Am. 93(2), 907-919, 1993. The graphs show the standard deviation of the gain as a function of frequency averaged for all directions in 1/3 octave bands. It is seen that the present invention provides an improvement by approximately a factor of 2 over the known methods, and thereby provides a significant improvement compared to prior art techniques.
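
A between-subject standard deviation of the gain in dB can be computed per frequency as sketched below. This is a simplified illustration only: it uses the sample standard deviation (an assumption, since the estimator is not stated), takes linear gain magnitudes for a single direction as input, and omits the averaging across directions and the 1/3 octave band analysis used for FIG. 8.

```python
import math
import statistics


def gain_std_db(gains_by_subject):
    """Standard deviation across subjects of the gain in dB, per frequency.
    `gains_by_subject` holds one list of positive linear gains per subject,
    all sampled at the same frequencies."""
    result = []
    for gains_at_frequency in zip(*gains_by_subject):
        levels = [20.0 * math.log10(g) for g in gains_at_frequency]
        result.append(statistics.stdev(levels))
    return result
```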

FIG. 9 shows a typical example of a Head-related Impulse Response. Different lengths of this impulse response (starting from t=0 in FIG. 9) are Fourier transformed and the results are shown in FIG. 10. The DC adjustments described below are performed before each Fourier transformation after truncation of the impulse response. It is seen from FIG. 10 that no significant changes in the frequency domain representation of the impulse response occur for impulses longer than 1 ms. As explained earlier, when evaluating the duration of the part of the Head-related Impulse Responses used in the simulation, it is important to study its frequency response. Examples are reported where an apparently short impulse cannot be truncated to a few milliseconds as the truncation changes its frequency response to an unacceptable extent because the impulse contains essential information over a longer time duration. FIGS. 9 and 10 illustrate that this is not true for the impulses of the present invention.

As mentioned before, until the present invention, the value at zero Hz of the frequency domain representation of the HTF (the DC value of the HTF) seems to have attracted little or no attention in the art. However, the research and development of the present inventors have revealed that the DC value has a significant influence on the frequency domain representation of the HTF, thereby influencing the sound quality, such as coloration, when the HTF is used in sound reproduction. FIG. 11 shows an example of a Head-related Impulse Response adjusted for different DC values and FIG. 12 shows the corresponding frequency domain representations. It is interesting to note that the influence on the time domain representations of the HTFs is barely visible while the influence on the frequency domain representations is significant.
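
The DC value of a sampled impulse response is simply the sum of its samples, which makes the observation easy to reproduce. The sketch below adjusts the DC value by adding the same small offset to every sample; this uniform-offset approach is an assumption chosen for illustration and is not necessarily the adjustment used for FIG. 11, and the function names are hypothetical.

```python
def dc_value(impulse_response):
    """Value at zero Hz of the transfer function: the sum of the samples."""
    return sum(impulse_response)


def adjust_dc(impulse_response, target):
    """Shift every sample by the same offset so that the DC value equals
    `target` (assumed, illustrative adjustment method)."""
    offset = (target - sum(impulse_response)) / len(impulse_response)
    return [sample + offset for sample in impulse_response]
```

For a 48-sample response with a DC value of 0.5, raising the DC value to 1.0 changes each sample by only about 0.01, while the gain at zero Hz doubles (a change of about 6 dB).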

FIG. 13 shows the time domain representations of the HTFs of a specific direction for one ear for a group of test persons and also the average value of these HTFs is shown (in this context the term averaging means the averaging of any function of the pressures measured, such as the pressure itself or the logarithmic pressure, or p² (the power average), etc.).

FIG. 14 shows the gain of the corresponding frequency domain representations of the HTFs of FIG. 13 and also the average gain is indicated.

FIG. 15 shows the gain of the HTFs shown in FIG. 14 but with the logarithmic average also shown. It will be noted that the logarithmic average seems to represent the group of HTFs better than the average shown in FIG. 14.
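
The difference between the two kinds of average can be seen in a small numerical sketch. It assumes that the average of FIG. 14 is an arithmetic mean of linear gain magnitudes and that the logarithmic average is the mean of the gains in dB, converted back to a linear gain; the function names are illustrative.

```python
import math


def linear_average_gain(gains_by_subject):
    """Arithmetic mean of the linear gains, frequency by frequency."""
    count = len(gains_by_subject)
    return [sum(column) / count for column in zip(*gains_by_subject)]


def logarithmic_average_gain(gains_by_subject):
    """Mean of the gains in dB, frequency by frequency, returned as a linear
    gain (equivalent to the geometric mean of the linear gains)."""
    count = len(gains_by_subject)
    averaged = []
    for column in zip(*gains_by_subject):
        mean_db = sum(20.0 * math.log10(g) for g in column) / count
        averaged.append(10.0 ** (mean_db / 20.0))
    return averaged
```

For two subjects with gains of 0.1 and 10 (that is, -20 dB and +20 dB) at some frequency, the linear average is 5.05 (about +14 dB), dominated by the larger value, whereas the logarithmic average is 1.0 (0 dB), midway between the two curves on a dB scale.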

In FIG. 14 and FIG. 15 only the gain is averaged, which leaves the phase to be defined. Several possibilities exist. FIG. 16 shows the time domain representation of the averaged HTFs with the minimum phase added and also the corresponding average with a zero phase is shown.

FIG. 17 and FIG. 18 show the time domain representations and the frequency domain representations of the HTFs of a specific direction for one ear for a group of test persons and also the average value of these HTFs is shown but after time alignment. The time alignment is performed, as the name indicates, in the time domain, e.g., by alignment to the onset of the pulses or alignment to the first peak, or alignment to maximum cross-correlation. In FIG. 17 and FIG. 18 the impulses are aligned to the onset of the impulses. It will be seen that the averages provided this way seem to reproduce more features of the HTFs than the averages without the time alignment.

The time alignment can be performed for the transfer functions of both ears together or independently for the transfer functions of each ear.

After time alignment and averaging, a linear phase is added to the averaged functions to account for the interaural time difference. The linear phase contribution to the function is calculated on the basis of the measured appertaining HTFs, such as the average of the linear phase contributions of all the HTFs.
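
A minimal sketch of onset alignment, sample-by-sample averaging, and restoration of the average delay is given below. It makes several assumptions that are not in the description: the onset is taken as the first sample reaching 10% of the peak magnitude, delays are handled as whole samples (a pure delay being the time domain counterpart of a linear phase), and the function names are hypothetical.

```python
def onset_index(impulse_response, threshold=0.1):
    """Index of the first sample whose magnitude reaches `threshold` times
    the peak magnitude (assumed onset criterion)."""
    peak = max(abs(sample) for sample in impulse_response)
    for index, sample in enumerate(impulse_response):
        if abs(sample) >= threshold * peak:
            return index
    return 0


def aligned_average(impulse_responses, threshold=0.1):
    """Align all responses to their onsets, average them sample by sample,
    and prepend the (rounded) average onset delay."""
    onsets = [onset_index(ir, threshold) for ir in impulse_responses]
    length = min(len(ir) - onset for ir, onset in zip(impulse_responses, onsets))
    count = len(impulse_responses)
    average = [sum(ir[onset + k] for ir, onset in zip(impulse_responses, onsets)) / count
               for k in range(length)]
    mean_delay = round(sum(onsets) / count)
    return [0.0] * mean_delay + average
```

Two identical pulses that start at samples 2 and 4 average to a single pulse of the same shape starting at sample 3, whereas averaging without alignment would smear them into two half-height pulses.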

Yet another way of averaging the HTFs of a specific direction is to perform a sort of parametric averaging by aligning the time domain representations according to significant features, e.g. aligning peaks and valleys of the HTFs either in the time domain or in the frequency domain including stretching or compressing the x-axis (time or frequency) in between peaks and valleys, followed by an averaging of the resulting functions and followed by the addition of the calculated, e.g. averaged, phase contribution.

In many applications, e.g. in virtual reality applications, it is desirable to be able to simulate a huge number of HTFs. According to the invention it is possible to simulate HTFs from a set of specific HTFs using interpolation.

For example, an HTF corresponding to a specific direction that lies in between the directions corresponding to four known HTFs could be calculated according to any of the calculation methods described above in the sections concerning averaging techniques. FIG. 19 and FIG. 20 show examples of this in the time domain and in the frequency domain.
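
One way to picture such an interpolation is as a weighted sample-by-sample average of four time-aligned impulse responses. The sketch below assumes that the four known directions form a rectangular cell in azimuth and elevation and uses bilinear weights; both the weighting scheme and the function names are assumptions made for illustration, and the average delay would still have to be added afterwards as for the averaging techniques.

```python
def bilinear_weights(azimuth, elevation, az0, az1, el0, el1):
    """Weights for the corners (az0, el0), (az1, el0), (az0, el1), (az1, el1)
    of a rectangular cell enclosing the wanted direction."""
    u = (azimuth - az0) / (az1 - az0)
    v = (elevation - el0) / (el1 - el0)
    return [(1 - u) * (1 - v), u * (1 - v), (1 - u) * v, u * v]


def interpolate_impulse_response(impulse_responses, weights):
    """Weighted sample-by-sample sum of equally long, time-aligned responses."""
    length = len(impulse_responses[0])
    return [sum(w * ir[k] for w, ir in zip(weights, impulse_responses))
            for k in range(length)]
```

At the centre of the cell all four weights equal 0.25, so the interpolated response reduces to the plain average of the four known responses; at a corner it reduces to the known response for that direction.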

In FIG. 22, FIG. 23 and FIG. 24, Group I angles designate angles above the horizontal plane and at the same side as the ear (including the horizontal plane and the median), and Group II angles designate the remaining angles.

We claim:
1. A method of generating binaural signals by filtering at least one sound input with at least one set of two filters, each set of two filters having been designed so that the two filters simulate the left ear and the right ear parts of a Head-related Transfer Function (HTF), the method having at least one of the following features (a), (b), and (c):
(a) the HTF is used generally for a population of humans for which the binaural signals are intended, the HTF being determined in such a manner that the standard deviation of the amplitude, in dB, between subjects is less than a limit selected from the group consisting of limit (i), limit (ii), limit (iii), and limit (iv), wherein:
limit (i) is at the most about 1.4 dB between 100 Hz and 1 kHz, and is at the most about 1.4 dB at 1 kHz, linearly increasing, on a logarithmic frequency axis, to about 3.2 dB at 4 kHz, and is at the most about 3.2 dB at 4 kHz, linearly increasing, on a logarithmic frequency axis, to about 6.0 dB at 8 kHz over at least a major part of the frequency interval between 1 kHz and 8 kHz, when determined with pure tones for first angles on and above the horizontal plane of the ears of said humans and on the same side of the ears of said humans;
limit (ii) is at the most about 1.4 dB between 100 Hz and 1 kHz, and is at the most about 1.4 dB at 1 kHz, linearly increasing, on a logarithmic frequency axis, to about 2.75 dB at 4 kHz, and is at the most about 2.75 dB at 4 kHz, linearly increasing, on a logarithmic frequency axis, to about 4.5 dB at 8 kHz over at least a major part of the frequency interval between 1 kHz and 8 kHz, when determined with 1/3 octave noise bands for first angles on and above the horizontal plane of the ears of said humans and on the same side of the ears of said humans;
limit (iii) is at the most about 1.5 dB between 100 Hz and 1 kHz, and is at the most about 1.5 dB at 1 kHz, linearly increasing, on a logarithmic frequency axis, to about 4.0 dB at 4 kHz, and is at the most about 4.0 dB at 4 kHz, linearly increasing, on a logarithmic frequency axis, to about 8.5 dB at 8 kHz over at least a major part of the frequency interval between 1 kHz and 8 kHz, when determined with pure tones for all angles other than said first angles; and
limit (iv) is at the most about 1.5 dB between 100 Hz and 1 kHz, and is at the most about 1.5 dB at 1 kHz, linearly increasing, on a logarithmic frequency axis, to about 3.0 dB at 4 kHz, and is at the most about 3.0 dB at 4 kHz, linearly increasing, on a logarithmic frequency axis, to about 5.5 dB at 8 kHz over at least a major part of the frequency interval between 1 kHz and 8 kHz, when determined with 1/3 octave noise bands for all angles other than said first angles;
(b) the duration of the time domain representation of the transfer function of the filter simulating the HTF is at the most 2 msec; and
(c) the value at zero Hertz of the frequency domain description of the transfer function of the filters simulating the HTF is in the range from 0.316 to 3.16.
2. The method according to claim 1, wherein the HTF has been determined in such a manner that the standard deviation of the amplitude, in dB, between subjects is less than a limit selected from the group consisting of limit (v), limit (vi), limit (vii), and limit (viii), wherein:
limit (v) is at the most about 1.0 dB between 100 Hz and 1 kHz, and is at the most about 1.0 dB at 1 kHz, linearly increasing, on a logarithmic frequency axis, to about 2.5 dB at 4 kHz, and is at the most about 2.5 dB at 4 kHz, linearly increasing, on a logarithmic frequency axis, to about 5.0 dB at 8 kHz over at least a major part of the frequency interval between 1 kHz and 8 kHz, when determined with pure tones for first angles on and above the horizontal plane of the ears of said humans and on the same side of the ears of said humans;
limit (vi) is at the most about 1.0 dB between 100 Hz and 1 kHz, and is at the most about 1.0 dB at 1 kHz, linearly increasing, on a logarithmic frequency axis, to about 2.25 dB at 4 kHz, and is at the most about 2.25 dB at 4 kHz, linearly increasing, on a logarithmic frequency axis, to about 3.0 dB at 8 kHz over at least a major part of the frequency interval between 1 kHz and 8 kHz, when determined with 1/3 octave noise bands for first angles on and above the horizontal plane of the ears of said humans and on the same side of the ears of said humans;
limit (vii) is at the most about 1.25 dB between 100 Hz and 1 kHz, and is at the most about 1.25 dB at 1 kHz, linearly increasing, on a logarithmic frequency axis, to about 3.0 dB at 4 kHz, and is at the most about 3.0 dB at 4 kHz, linearly increasing, on a logarithmic frequency axis, to about 7.0 dB at 8 kHz over at least a major part of the frequency interval between 1 kHz and 8 kHz, when determined with pure tones for all angles other than said first angles; and
limit (viii) is at the most about 1.1 dB between 100 Hz and 1 kHz, and is at the most about 1.1 dB at 1 kHz, linearly increasing, on a logarithmic frequency axis, to about 2.5 dB at 4 kHz, and is at the most about 2.5 dB at 4 kHz, linearly increasing, on a logarithmic frequency axis, to about 4.5 dB at 8 kHz over at least a major part of the frequency interval between 1 kHz and 8 kHz, when determined with 1/3 octave noise bands for angles other than said first angles.
 3. The method according to claim 2, wherein the HTF has been determined in such a manner that the standard deviation of the amplitude, in dB, between subjects is less than a limit selected from the group consisting of limit (ix), limit (x), limit (xi), and limit (xii), wherein:
limit (ix) is at the most about 0.8 dB between 100 Hz and 1 kHz, and is at the most about 0.8 dB at 1 kHz, linearly increasing, on a logarithmic frequency axis, to about 2.0 dB at 4 kHz, and is at the most about 2.0 dB at 4 kHz, linearly increasing, on a logarithmic frequency axis, to about 4.0 dB at 8 kHz over at least a major part of the frequency interval between 1 kHz and 8 kHz, when determined with pure tones for first angles on and above the horizontal plane of the ears of said humans and on the same side of the ears of said humans;
limit (x) is at the most about 0.8 dB between 100 Hz and 1 kHz, and is at the most about 0.8 dB at 1 kHz, linearly increasing, on a logarithmic frequency axis, to about 1.6 dB at 4 kHz, and is at the most about 1.6 dB at 4 kHz, linearly increasing, on a logarithmic frequency axis, to about 2.75 dB at 8 kHz over at least a major part of the frequency interval between 1 kHz and 8 kHz, when determined with 1/3 octave noise bands for first angles on and above the horizontal plane of the ears of said humans and on the same side of the ears of said humans;
limit (xi) is at the most about 1.0 dB between 100 Hz and 1 kHz, and is at the most about 1.0 dB at 1 kHz, linearly increasing, on a logarithmic frequency axis, to about 2.5 dB at 4 kHz, and is at the most about 2.5 dB at 4 kHz, linearly increasing, on a logarithmic frequency axis, to about 6.2 dB at 8 kHz over at least a major part of the frequency interval between 1 kHz and 8 kHz, when determined with pure tones for all angles other than said first angles; and
limit (xii) is at the most about 0.9 dB between 100 Hz and 1 kHz, and is at the most about 0.9 dB at 1 kHz, linearly increasing, on a logarithmic frequency axis, to about 2.0 dB at 4 kHz, and is at the most about 2.0 dB at 4 kHz, linearly increasing, on a logarithmic frequency axis, to about 3.5 dB at 8 kHz over at least a major part of the frequency interval between 1 kHz and 8 kHz, when determined with 1/3 octave noise bands for angles other than said first angles.
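Each limit in claim 3 describes a curve that is flat from 100 Hz to 1 kHz and then rises in two segments that are straight lines on a logarithmic frequency axis. The following is a minimal illustrative sketch of how such a curve might be evaluated, using the three breakpoint values of limit (xii). The function name `limit_db` and the tuple `LIMIT_XII` are hypothetical names introduced here, and the sketch ignores the qualifier "over at least a major part of the frequency interval".

```python
import math

def limit_db(freq_hz, flat_db, db_at_4k, db_at_8k):
    """Illustrative upper bound (dB) on the between-subject standard deviation.

    Flat between 100 Hz and 1 kHz, then linear in log-frequency from
    1 kHz to 4 kHz and from 4 kHz to 8 kHz.
    """
    if not 100.0 <= freq_hz <= 8000.0:
        raise ValueError("the limits are stated only for 100 Hz to 8 kHz")
    if freq_hz <= 1000.0:
        return flat_db
    if freq_hz <= 4000.0:
        lo_f, hi_f, lo_db, hi_db = 1000.0, 4000.0, flat_db, db_at_4k
    else:
        lo_f, hi_f, lo_db, hi_db = 4000.0, 8000.0, db_at_4k, db_at_8k
    # Fraction of the way along the segment, measured on a log axis.
    t = math.log(freq_hz / lo_f) / math.log(hi_f / lo_f)
    return lo_db + t * (hi_db - lo_db)

# Breakpoints of limit (xii): 0.9 dB up to 1 kHz, 2.0 dB at 4 kHz, 3.5 dB at 8 kHz.
LIMIT_XII = (0.9, 2.0, 3.5)
```

At 2 kHz, which is halfway between 1 kHz and 4 kHz on a logarithmic axis, the sketch yields a value midway between 0.9 dB and 2.0 dB.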
 4. The method according to claim 1, wherein the duration of the time domain representation of the transfer function of the filters simulating the HTF is at the most 1.5 msec.
 5. The method according to claim 4, wherein the duration of the time domain representation of the transfer function of the filters simulating the HTF is at the most 1.2 msec.
 6. The method according to claim 5, wherein the duration of the time domain representation of the transfer function of the filters simulating the HTF is at the most 1 msec.
 7. The method according to claim 6, wherein the duration of the time domain representation of the transfer function of the filters simulating the HTF is at the most 0.9 msec.
 8. The method according to claim 7, wherein the duration of the time domain representation of the transfer function of the filters simulating the HTF is at the most 0.75 msec.
 9. The method according to claim 8, wherein the duration of the time domain representation of the transfer function of the filters simulating the HTF is at the most 0.5 msec.
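For a digital filter, the durations recited in claims 4 through 9 translate into a number of sampling intervals once a sampling rate is chosen. The sketch below assumes a sampling rate of 48 kHz, which is an illustrative choice and is not stated in the claims. The function name `max_sample_intervals` is hypothetical.

```python
def max_sample_intervals(duration_us, sample_rate_hz):
    """Whole sampling intervals that fit within duration_us microseconds.

    Integer arithmetic is used so that the result is not affected by
    floating-point rounding.
    """
    return duration_us * sample_rate_hz // 1_000_000

# Durations from claims 4 to 9, in microseconds.
CLAIMED_DURATIONS_US = [1500, 1200, 1000, 900, 750, 500]

# Assumed sampling rate (not taken from the claims).
ASSUMED_RATE_HZ = 48_000

intervals = [max_sample_intervals(d, ASSUMED_RATE_HZ) for d in CLAIMED_DURATIONS_US]
```

Under this assumption the 1.5 msec limit corresponds to 72 sampling intervals and the 0.5 msec limit to 24.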
 10. The method according to claim 1, wherein the value at zero Hertz of the frequency domain description of the transfer function of the filters simulating the HTF is in the range from 0.5 to 2.
 11. The method according to claim 10, wherein the value at zero Hertz of the frequency domain description of the transfer function of the filters simulating the HTF is in the range from 0.7 to 1.4.
 12. The method according to claim 11, wherein the value at zero Hertz of the frequency domain description of the transfer function of the filters simulating the HTF is in the range from 0.8 to 1.2.
 13. The method according to claim 12, wherein the value at zero Hertz of the frequency domain description of the transfer function of the filters simulating the HTF is in the range from 0.9 to 1.1.
 14. The method according to claim 13, wherein the value at zero Hertz of the frequency domain description of the transfer function of the filters simulating the HTF is in the range from 0.95 to 1.05.
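For a filter described by a finite impulse response, the value of the frequency domain description at zero Hertz equals the sum of the impulse response samples. The sketch below shows how the ranges of claims 10 through 14 might be checked for such a filter. It assumes the filter is available as a list of real coefficients, and the function names `dc_value` and `dc_value_in_range` are hypothetical.

```python
def dc_value(impulse_response):
    """Value of the transfer function at 0 Hz: the sum of the coefficients."""
    return sum(impulse_response)

def dc_value_in_range(impulse_response, low, high):
    """True if the 0 Hz value lies in the closed range [low, high]."""
    return low <= dc_value(impulse_response) <= high

# Example coefficients chosen purely for illustration.
example_filter = [0.5, 0.25, 0.25]
```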
 15. The method according to claim 1, wherein the HTF has been determined using at least one of the following measures (A) through (I):
(A) the sound pressure p₂ from a spatially arranged sound source, measured at a reference point at the entrance, or close to the entrance, of a blocked ear canal of a person or of an artificial head;
(B) the sound pressure p₁ from a sound source, measured at a position between the ears of the person or of the artificial head, with the person or the artificial head absent;
(C) the frequency domain description of the HTF has been calculated by dividing the frequency domain description of p₂ by the frequency domain description of p₁;
(D) the time domain description of the HTF has been obtained by inverse Fourier transformation of the frequency domain description;
(E) for a particular direction in relation to the person or the artificial head, the left and right ear parts of the HTF have been measured simultaneously;
(F) the person has been standing during the measurement of the HTF;
(G) the person has been monitored by visual means to ensure that the position of the head of the person was not changed during the measurement of the HTF, and any measurement of an HTF during which the position of the head of the person differed from the correct position has been discarded;
(H) the person himself monitored the position of his head in order to keep his head in the correct position during measurement of the HTF; and
(I) the measurements were carried out in an anechoic chamber, the measurement time for one HTF being at the most about 5 seconds.
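Measures (C) and (D) of claim 15 can be illustrated as a spectral division followed by an inverse transform. The sketch below is a minimal illustration that assumes both pressures are available as equally long lists of samples and uses a direct (slow) discrete Fourier transform so that it needs only the standard library. The function names are hypothetical, and the sketch does not guard against frequency bins where the reference spectrum is zero.

```python
import cmath

def dft(samples):
    """Direct discrete Fourier transform of a list of samples."""
    n = len(samples)
    return [sum(samples[k] * cmath.exp(-2j * cmath.pi * m * k / n)
                for k in range(n))
            for m in range(n)]

def idft(spectrum):
    """Direct inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[m] * cmath.exp(2j * cmath.pi * m * k / n)
                for m in range(n)) / n
            for k in range(n)]

def htf_from_pressures(p2, p1):
    """Return (frequency domain HTF, time domain HTF).

    p2: pressure at the blocked ear canal entrance, as in measure (A).
    p1: pressure at the head position with the head absent, as in measure (B).
    """
    spectrum_p2 = dft(p2)
    spectrum_p1 = dft(p1)
    # Measure (C): bin-by-bin division (assumes no bin of p1 is zero).
    htf_freq = [a / b for a, b in zip(spectrum_p2, spectrum_p1)]
    # Measure (D): inverse transform; real part taken for real-valued inputs.
    htf_time = [value.real for value in idft(htf_freq)]
    return htf_freq, htf_time
```

A practical implementation would more likely use a fast Fourier transform routine; the direct transform is used here only to keep the sketch self-contained.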
 16. The method according to claim 15, wherein the reference point is at most 0.8 cm from the entrance to the blocked ear canal.
 17. The method according to claim 16, wherein the reference point is at most 0.6 cm from the entrance to the blocked ear canal.
 18. The method according to claim 17, wherein the reference point is at most 0.3 cm from the entrance to the blocked ear canal.
 19. The method according to claim 18, wherein the reference point is at the entrance to the blocked ear canal.
 20. The method according to claim 1, wherein the HTF has been obtained from HTFs (B), defined as HTFs that have been determined for at least two test objects, a test object being a person or an artificial head, by selecting an HTF which, when used in binaural synthesis, gives a sound impression which, when presented to a test panel, is found to give a high degree of conformity with real life listening to a sound source in the direction in question.
 21. The method according to claim 1, wherein the HTF has been obtained from HTFs (B), defined as HTFs that have been determined for at least two test objects, a test object being a person or an artificial head, by selecting an HTF which shows a high degree of similarity to individual HTFs of a population.
 22. The method according to claim 20, wherein the HTFs relating to at least two angles of sound incidence have been individually selected among HTFs (B).
 23. The method according to claim1, wherein the HTF has been obtained from HTFs (B), defined as HTFs thathave been determined for at least two test objects, a test object beinga person or an artificial head, by averaging, in the frequency domain,the amplitude of the HTFs (B).
 24. The method according to claim 1,wherein the HTF has been obtained from HTFs (B), defined as HTFs thathave been determined for at least two test objects, a test object beinga person or an artificial head, by averaging in the time domain, thetime-aligned HTFs (B).
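The two averaging approaches of claims 23 and 24 can be sketched as follows. The sketch rests on assumptions that the claims leave open: the amplitude average is taken on linear magnitudes rather than on dB values, and time alignment is done by shifting each impulse response so that its largest absolute peak coincides with the earliest such peak in the set. All responses are assumed to have the same length. The function names are hypothetical.

```python
def average_amplitude(spectra):
    """Mean magnitude per frequency bin over several HTF spectra (claim 23).

    Phase is discarded; linear (not dB) averaging is an assumption.
    """
    count = len(spectra)
    bins = len(spectra[0])
    return [sum(abs(s[k]) for s in spectra) / count for k in range(bins)]

def average_time_aligned(responses):
    """Mean of impulse responses after peak alignment (claim 24).

    Each response is advanced so that its largest absolute peak lines up
    with the earliest peak in the set, then zero-padded at the end.
    """
    peaks = [max(range(len(r)), key=lambda i: abs(r[i])) for r in responses]
    reference = min(peaks)
    length = len(responses[0])
    aligned = []
    for response, peak in zip(responses, peaks):
        shift = peak - reference  # samples by which to advance this response
        aligned.append(list(response[shift:]) + [0.0] * shift)
    return [sum(a[i] for a in aligned) / len(aligned) for i in range(length)]
```

Averaging without alignment would smear responses whose arrival times differ, which is presumably why claim 24 specifies time-aligned HTFs.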
 25. The method according to claim 23, wherein at least a portion of the frequency axis has been either compressed or expanded individually for each HTF to reduce the differences between the HTFs before the averaging.
 26. The method according to claim 24, wherein at least a portion of the time axis has been either compressed or expanded individually for each HTF to reduce the differences between the HTFs before the averaging.
 27. The method according to claim 1, wherein the HTF has been obtained from HTFs (B), defined as HTFs that have been determined for at least two test objects, a test object being a person or an artificial head, by averaging characteristic parameters of the HTFs (B).
 28. The method according to claim 27, wherein the characteristic parameters are the frequency and the amplitude of characteristic points when the HTFs (B) are described in the frequency domain.
 29. The method according to claim 27, wherein the characteristic parameters are the time and the amplitude of characteristic points when the HTFs are described in the time domain.
 30. The method according to claim 27, wherein the characteristic parameters are the coordinates of poles and zeroes when the HTFs are described in the complex s- or z-domain.
 31. The method according to claim 1, wherein the HTF is an HTF (D), defined as an HTF that has been obtained from an HTF that has been selected from the group consisting of the 97 HTFs shown in each of FIGS. 1, 2, and 3.
 32. The method according to claim 31, wherein the HTF (D) has been produced by further signal processing of an HTF selected from the group consisting of the 97 HTFs shown in each of FIGS. 1, 2, and 3.
 33. The method according to claim 32, wherein the HTF, when used for binaural synthesis, gives an audible impression that is not clearly different from the impression given by an HTF (D), wherein the term "clearly different" means that a panel of inexperienced listeners obtains a score of at least 90 percent correct answers, when the HTF is compared to an HTF (D) in a balanced, four-alternative-forced-choice test, using program material for which the binaural signals are used, or for which the binaural signals are intended to be used.
 34. The method according to claim 33, wherein the term "clearly different" means that the panel of inexperienced listeners obtains a score of at least 80 percent correct answers.
 35. The method according to claim 34, wherein the term "clearly different" means that the panel of inexperienced listeners obtains a score of at least 70 percent correct answers.
 36. The method according to claim 35, wherein the term "clearly different" means that the panel of inexperienced listeners obtains a score of at least 50 percent correct answers.
 37. The method according to claim 1, wherein the HTF is adapted to at least one listener, comprising the further step of modifying the interaural time difference of the HTF, the modification being based on the physical dimension of the at least one listener.
 38. The method according to claim 1, wherein the HTF is adapted to at least one listener, comprising the further step of modifying the interaural time difference of the HTF, the modification being based on a psychoacoustic experiment, where the HTF is used for binaural synthesis, and the interaural time difference is adjusted so that the sound impression as perceived by the at least one listener is found to give a high degree of conformity with real life listening to a sound source in the direction intended.
 39. The method according to claim 1, wherein the HTF has been obtained as an approximate HTF for any specific angle of sound incidence, by interpolating neighboring HTFs, the interpolation being carried out as a weighted average of neighboring HTFs.
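The weighted average of claim 39 can be illustrated for the simplest case of two neighboring HTFs along one angular coordinate. The sketch assumes that the weights vary linearly with angular distance and that the average is applied sample by sample to impulse responses of equal length; neither choice is fixed by the claim. The function name `interpolate_htf` is hypothetical.

```python
def interpolate_htf(angle, angle_a, h_a, angle_b, h_b):
    """Approximate HTF at `angle` from two neighbors measured at
    angle_a and angle_b (angle_a < angle_b), using linear weights.
    """
    if angle_a >= angle_b or not angle_a <= angle <= angle_b:
        raise ValueError("angle must lie between two distinct neighbor angles")
    weight_b = (angle - angle_a) / (angle_b - angle_a)
    weight_a = 1.0 - weight_b
    # The nearer neighbor receives the larger weight.
    return [weight_a * x + weight_b * y for x, y in zip(h_a, h_b)]
```

For example, an HTF for 30 degrees interpolated between measurements at 0 and 40 degrees would weight the 40 degree measurement by 0.75 and the 0 degree measurement by 0.25.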
 40. The method according to claim 39, wherein the averaging is an averaging procedure wherein the HTF has been obtained from HTFs (B), defined as HTFs that have been determined for at least two test objects, a test object being a person or an artificial head, by averaging, in the frequency domain, the amplitude of the HTFs (B).
 41. The method according to claim 1, wherein the HTF has been obtained as an approximate HTF on the basis of a nearby HTF (B), by performing an adjustment of the linear phase of the HTF (B) to obtain substantially the interaural time difference pertaining to the angle of incidence for which the approximate HTF is intended, wherein an HTF (B) is defined as an HTF that has been determined for at least two test objects, a test object being a person or an artificial head.
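Adding a linear phase term to a transfer function is equivalent to delaying its impulse response, so the adjustment of claim 41 can be sketched in the time domain as a change of the relative delay between the two ear parts. The sketch assumes the interaural time difference is expressed as a whole number of samples, defined as the right ear delay minus the left ear delay, and that both ear parts are Python lists. The function name `set_itd` is hypothetical.

```python
def set_itd(h_left, h_right, current_itd, target_itd):
    """Change the interaural time difference of an HTF pair.

    ITDs are in samples (right delay minus left delay). One ear part is
    delayed by prepending zeros; the other is zero-padded at the end so
    both parts keep equal length.
    """
    extra = target_itd - current_itd
    pad = [0.0] * abs(extra)
    if extra >= 0:
        # Right ear must arrive later than before.
        return h_left + pad, pad + h_right
    # Left ear must arrive later than before.
    return pad + h_left, h_right + pad
```

Delays that are not a whole number of samples would require a fractional-delay filter or a phase adjustment in the frequency domain, which this sketch does not attempt.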
 42. A method of obtaining an approximate short distance HTF for a short distance between a listener and a sound source for use in methods of generating binaural signals, comprising the steps of:
(1) determining (a) a left ear part HTF representing the geometric angle from the source position to the left ear position, or, if the left ear is not visible from the source position, the geometric angle from the source position tangentially to the part of the head obscuring the left ear, and (b) a right ear part HTF representing the geometric angle from the source position to the right ear position, or, if the right ear is not visible from the source position, the geometric angle from the source position tangentially to the part of the head obscuring the right ear; and
(2) combining the left ear part HTF with the right ear part HTF.
 43. The method according to claim 42, further comprising the step of individually adjusting the levels of the left ear part HTF and the right ear part HTF.
 44. The method according to claim 1, wherein the method is performed using an HTF produced by combining (a) the left ear part of an HTF representing the geometric angle from the source position to the left ear position, or, if the left ear is not visible from the source position, the geometric angle from the source position tangentially to the part of the head obscuring the left ear, with (b) the right ear part of an HTF representing the geometric angle from the source position to the right ear position, or, if the right ear is not visible from the source position, the geometric angle from the source position tangentially to the part of the head obscuring the right ear.
 45. The method according to claim 44, further comprising the step of individually adjusting the levels of the left ear and the right ear parts of the HTF.
 46. A method of generating binaural signals by filtering at least one sound input with one set of two filters, the set of two filters having been obtained from an HTF as defined in claim 1, by further processing which maintains the information contents inherent in the original HTF, the further processing of the left and right ear parts of the HTF being substantially identical.
 47. A method of generating binaural signals by filtering at least one sound input with at least two sets of two filters, the sets of two filters having been obtained from HTFs as defined in claim 1, by further processing that maintains the information contents inherent in the original set of HTFs, the said further processing being substantially identical for the various angles, but not necessarily being substantially identical for the left and right ear parts of the sets of HTFs.
 48. The method according to claim 46, further comprising the step of signal processing that has been performed so that the amplitude of a binaural signal formed by binaural synthesis of a particular sound field is substantially identical to the amplitude of the particular sound field itself.
 49. The method according to claim 1, wherein at least two first sound inputs are combined into one second sound input which is filtered with one set of two filters simulating an HTF.
 50. The method according to claim 49, wherein the first sound inputs are sound inputs belonging together in spatial groups in relation to the listener.
 51. The method according to claim 1, wherein the binaural signals are supplemented with supplementing signals corresponding to reflections.
 52. The method according to claim 1, wherein the at least one sound input is filtered with at least two sets of two filters, each set of two filters having been designed so that the two filters simulate the left ear and the right ear parts of an HTF.
 53. The method according to claim 52, wherein the at least one sound input is filtered with at least three sets of two filters, each set of two filters having been designed so that the two filters simulate the left ear and the right ear parts of an HTF.
 54. The method according to claim 1, wherein the binaural signals are used for simulation of a sound field of a specific environment, wherein transmission of sound from a set of sound sources with specific positions in said environment to a receiving point with a specific position in said environment is simulated by:
(i) forming, for each of a number of transmission paths for each sound source, a first binaural signal;
(ii) combining the first binaural signals for each sound source into a second binaural signal; and
(iii) combining the second binaural signals of the set of sound sources into a resulting third binaural signal.
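The three steps of claim 54 can be sketched as two nested summations. The sketch assumes that each transmission path is characterized by a gain, a delay in whole samples, and a pair of impulse responses for the left and right ear parts of the HTF for the path's direction of arrival, and that "combining" means sample-wise addition. These parameters and all function names are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct linear convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def mix(signals):
    """Sample-wise sum of signals of possibly different lengths."""
    length = max(len(s) for s in signals)
    return [sum(s[i] for s in signals if i < len(s)) for i in range(length)]

def render_source(signal, paths):
    """Steps (i) and (ii): one binaural signal per path, then their sum.

    paths: list of (gain, delay_samples, h_left, h_right) tuples.
    """
    lefts, rights = [], []
    for gain, delay, h_left, h_right in paths:
        delayed = [0.0] * delay + [gain * v for v in signal]
        lefts.append(convolve(delayed, h_left))    # first binaural signal, left
        rights.append(convolve(delayed, h_right))  # first binaural signal, right
    return mix(lefts), mix(rights)                 # second binaural signal

def render_scene(sources):
    """Step (iii): sum the second binaural signals of all sources.

    sources: list of (signal, paths) tuples.
    """
    rendered = [render_source(signal, paths) for signal, paths in sources]
    return mix([l for l, _ in rendered]), mix([r for _, r in rendered])
```

A direct sound and a single reflection for one source would be passed as two entries in that source's `paths` list, the reflection having a smaller gain and a longer delay.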
 55. A method for sound measurement or assessment, where a description of sound transmission is involved, comprising the step of using binaural signals produced according to the method of claim 1.
 56. The method according to claim 1, further comprising the steps of:
sensing at least one property selected from the group consisting of (i) the position of the head of a listener, (ii) orientation of the head of a listener, (iii) changes in the position of the head of a listener, and (iv) changes in the orientation of the head of a listener; and
modifying the electronic signal processing in response to the sensed property.
 57. The method according to claim 56, further comprising the steps of:
(a) transmitting at least one pulse of energy adapted to be received by receiving means mounted at and following the movements of the head of the listener;
(b) detecting the arrival time of each of the transmitted energy pulses at the receiving means and optionally detecting or recording the time of transmission of each of the pulses; and
(c) calculating at least one of the position and orientation of the head of the listener based on the detected arrival time or times and optionally on the detected or recorded time or times of the transmissions.
 58. The method according to claim 56, wherein the modification of the electronic signal processing is adapted to impart to the listener the perception that virtual sound sources remain in position irrespective of the sensed property of the listener's head.
 59. The method according to claim 56, wherein the signal processing is modified using an approximation method, wherein the HTF has been obtained as an approximate HTF on the basis of a nearby HTF (B), by performing an adjustment of the linear phase of the HTF (B) to obtain substantially the interaural time difference pertaining to the angle of incidence for which the approximate HTF is intended, wherein an HTF (B) is defined as an HTF that has been determined for at least two test objects, a test object being a person or an artificial head.
 60. The method according to claim 1, further comprising the step of transmitting the binaural signals in the form of modulated ultrasonic waves, the waves being received by a listener equipped with two receiving means, each of which is mounted close to the appertaining ear of the listener, with changes in the orientation of the listener's head relative to a reference orientation being compensated on the basis of the difference of the travel time of the ultrasonic wave pulses between the two receiving means, so that the listener will perceive that virtual sound sources remain in a reference position irrespective of the orientation of the listener's head.
 61. The method of generating binaural signals according to claim 1, wherein the sound inputs to be filtered by Head-related Transfer Functions are signals (A₁, ..., Aₙ) of a communication system, which signals are adapted for being supplied to at least one signal-to-sound transducer, so that the binaural signal, when reproduced, is capable of imparting to a listener a perception of listening to a spatial sound field with a set of n individually positioned transmitters, each of which transmits one of the signals (A₁, ..., Aₙ) and each of which corresponds to a virtual sound source.
 62. The method according to claim 61, wherein the position and orientation of the listener's head are monitored, and head position and head orientation data obtained in the monitoring are used to enable the listener to selectively transmit a message to one of the transmitters corresponding to one of the signals (A₁, ..., Aₙ) by turning his or her head in the direction of the virtual sound source corresponding to said transmitter.
 63. The method according to claim 61, wherein the sound inputs to be filtered by Head-related Transfer Functions are generated in connection with communicating with a multitude of units.
 64. The method of generating binaural signals according to claim 1, wherein the sound inputs to be filtered by Head-related Transfer Functions are signals (A₁, ..., Aₙ) of a multichannel sound reproducing system, which signals are adapted for being supplied to n different signal-to-sound transducers of the multichannel sound reproducing system, so that the binaural signal, when reproduced, is capable of imparting to a listener a perception of listening to a spatial sound field similar to the sound field that would have resulted from listening to the n signal-to-sound transducers spatially arranged in a room.
 65. The method according to claim 64, wherein the multichannel sound reproducing system is selected from the group consisting of a Dolby® Surround System and an N channel sound system pertaining to HDTV.
 66. The method according to claim 64, wherein the multichannel sound reproducing system is a stereo system.
 67. The method according to claim 1, wherein the binaural signals are used for positioning a set of sounds at specific virtual positions in relation to an operator.
 68. The method according to claim 67, wherein a moving virtual sound source with a characteristic sound moves between specific positions of a set of virtual sound sources, the operator being enabled to communicate a specific message to the system according to a particular virtual sound source by prompting the system when the moving virtual sound source is positioned substantially at the position of said particular virtual sound source.
 69. The method according to claim 68, wherein the position of the moving virtual sound source is controlled by the operator.
 70. The method according to claim 68, wherein the position of the moving virtual sound source is controlled by the orientation of the head of the operator.
 71. The method according to claim 67, wherein the positions are dynamically controlled by a computer.
 72. The method according to claim 71, when used for controlling the movement of an object by dynamically positioning a virtual sound source in relation to the object, so as to guide the object in relation to the position of the virtual sound source.
 73. The method according to claim 1, further comprising the step of compensating transfer characteristics of a signal-to-sound transducer.
 74. The method according to claim 73, wherein sound pressure at the entrance, or close to the entrance, to a blocked ear canal is considered as the output of the signal-to-sound transducer.
 75. The method according to claim 1, wherein the binaural signal is emitted by means of headphones.
 76. The method according to claim 75, wherein the binaural signal is transmitted to the headphones by wireless means.
 77. The method according to claim 74, further comprising the step of compensating for the difference in pressure division at the input to the ear canal when the ear is respectively occluded and unoccluded by a headphone.
 78. The method according to claim 77, wherein a description of the difference in pressure division at the input to the ear canal when the ear is respectively occluded and unoccluded by a headphone is obtained by:
(a) measuring the transmission from the headphone to the sound pressure (i) at the entrance, or close to the entrance, of the blocked ear canal, and (ii) at the entrance, or close to the entrance, of the open ear canal, the ratio of the frequency domain descriptions of these transmissions being obtained as characteristic of a first pressure division "X";
(b) measuring the transmission from a sound source that does not influence the acoustic radiation impedance of the ear, to the sound pressure (i) at the entrance, or close to the entrance, of the blocked ear canal, and (ii) at the entrance, or close to the entrance, of the open ear canal, the ratio of the frequency domain descriptions of these transmissions being obtained as characteristic of a second pressure division "Y"; and
(c) obtaining the ratio X/Y which constitutes the frequency domain description of the difference in pressure division.
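The ratio of claim 78 can be computed bin by bin once the four transmissions are available as complex spectra on a common frequency grid. The claim does not state which way round each pressure division is formed; the sketch below assumes open ear canal divided by blocked ear canal for both X and Y. Using the opposite convention for both would yield the reciprocal of the result. The function name and parameter names are hypothetical.

```python
def pressure_division_difference(hp_open, hp_blocked, ref_open, ref_blocked):
    """Return X/Y per frequency bin.

    hp_open, hp_blocked:   transmissions from the headphone, step (a).
    ref_open, ref_blocked: transmissions from a source that does not load
                           the ear acoustically, step (b).
    Assumed convention: each pressure division is open / blocked.
    """
    x = [o / b for o, b in zip(hp_open, hp_blocked)]    # first division "X"
    y = [o / b for o, b in zip(ref_open, ref_blocked)]  # second division "Y"
    return [xv / yv for xv, yv in zip(x, y)]            # step (c)
```

A result of 1 in every bin would indicate that the headphone does not alter the pressure division relative to the reference source.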
 79. The method according to claim 1, wherein the binaural signal is emitted by means of loudspeakers.
 80. The method according to claim 1, wherein the step of compensating is adapted to the individual listener.
 81. The method according to claim 1, wherein the binaural signal is stored in an audio storage medium.
 82. The method according to claim 49, wherein the binaural signal is stored in an audio storage medium, and wherein each of the second sound inputs to be filtered by Head-related Transfer Functions representing a combination of more than one of the first sound inputs is stored separately, the binaural filtering being carried out before or after storing.
 83. A method of computer modeling or analyzing the cerebral human binaural sound localization ability, comprising the step of using binaural signals obtained according to the method of claim 1.
 84. A method of computer modeling or analyzing the cerebral human binaural sound localization ability, comprising the step of using HTFs as characterized in claim 1.
 85. A method for designing headphones, comprising the step of adapting the transfer characteristics thereof to resemble an HTF, as characterized in claim 1, for a given direction or to resemble weighted averages of such HTFs corresponding to averages of given directions.
 86. An artificial head having HTFs which correspond substantially to HTFs according to claim 1 for at least angles of sound incidence which constitute part of the total sphere surrounding the artificial head.
 87. A method for producing an artificial head having HTFs which correspond substantially to HTFs according to claim 1 for at least angles of sound incidence which constitute part of the total sphere surrounding the artificial head, comprising the step of adapting the geometric characteristics of the artificial head so as to approximate the HTFs of the artificial head to HTFs according to claim 1 at least for angles of sound incidence which constitute part of the total sphere surrounding the artificial head.